1 Introduction


In  this  report  we  describe  a  target  detection  concept  which  uses  the  change  of  the 
complex-valued  synthetic  aperture  radar  (SAR)  signature  in  a  pixel  when  resolution 
is  varied.  The  conjecture  is  that  there  is  a  specific  signature  change  between  groups 
of  cultural  scatterers  which  is  different  from  the  signature  change  in  random  clutter 
when  resolution  is  varied.  This  change  in  signature  is  due  to  the  interference  of  the 
prominent  scatterers  as  they  contribute  to  the  signature  in  the  resolution  cell.  As 
resolution  changes,  different  target  prominent  scatterers  enter/exit  a  resolution  cell. 
Clutter,  on  the  other  hand,  will  have  large  numbers  of  equivalued  scatterers  causing 
it  to  remain  random  with  respect  to  resolution. 

Many synthetic aperture radar (SAR) target detection algorithms work on a single
resolution image at the finest available resolution [20], [21]. However, there is
compelling evidence to suggest that significant performance gains can be achieved by
casting the detection problem in a multiresolution setting. We argue, using simple
physical principles, that this performance gain is a direct result of the coherent
interference effects that occur in typical radar target signatures as the imaging
resolution is varied. We then apply a multiresolution sampling strategy for choosing
optimal resolutions for coherent target detection.

SAR images of man-made objects typically consist of spatial patterns of bright
points and lines resulting from radar backscatter from discrete physical features such
as corners, edges, flat plates and other primitive geometric shapes. The coherent
radar return from each of these discrete features, or prominent scatterers, is a
complex phasor with amplitude equal to the local radar cross-section of the target
feature. At fine enough resolutions, these prominent scatterers are isolated in individual
resolution cells and they dominate the target signature. As the resolution changes
from fine to coarse, adjacent scatterers become lumped together into a single
resolution cell and coherently interfere with each other, leading to characteristic changes
in amplitude and phase as a function of resolution. It is this relationship between
phase and amplitude as a function of resolution that we exploit through the use of a
multiresolution-based detector.

There are two potential benefits which can be gained from exploiting this
discriminant. The first benefit is that more reliable target detection (higher probability of
detection for a given false alarm rate) can be obtained. The second is that the
aperture required to match the detection performance of a non-multiresolution approach
may be reduced, which increases the system search rate. These benefits were
empirically verified. An increase in detection probability of approximately a
factor of 2 was achieved for a given false alarm rate using multiresolution signatures.
In addition, the use of multiple resolutions attained an equivalent performance to a
pixel-by-pixel fine resolution approach without the need for the fine resolution data.

In our development of SAR imaging, we will assume linear FM transmission signals
and the usual far-field assumptions [2]. Neglecting polar formatting concerns (data
is assumed interpolated or backprojection processing is used) and higher order phase
effects (range walk and variable range rate), the complex image $T(x; \rho)$ can be
expressed in terms of the scene, $s(x)$, $x \in \mathbb{R}^2$, by equations (1) and (2).

[Equations (1) and (2) are not legible in the source; only the quadratic phase factor
$e^{j 2\pi |x|^2 / (\lambda r_a)}$ survives.]

The quadratic phase term $e^{j 2\pi |x|^2 / (\lambda r_a)}$ arises as a result of retaining
the second order term in the binomial series expansion of the distance from the SAR
antenna to a specific point in the scene. This quadratic term incorporates the interference
(phase mixing) of the various scatterers within a resolution cell. This effect is usually
accounted for by defining a complex scene $c(x)$. As resolution is varied the scatterer
interference will take on a unique characteristic determined by the strength and
relative location of the scatterers within the resolution cell.

To obtain an intuitive understanding of the role that interference of prominent
scatterers plays, a simple example is now presented. The example is depicted at the
top of Figure 1. A set of 3 scatterers is placed equidistantly in azimuth. The scene
can then be written as

$$ s(x) = \sum_{i=1}^{3} \delta(x - x_i) \qquad (3) $$

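For an unweighted phase history, the complex image of this scene is

$$ T(x; \rho) = \sum_{i=1}^{3} e^{j 2\pi |x_i|^2 / (\lambda r_a)} \, \mathrm{sinc}(\cdot) \qquad (4) $$

where the argument of the sinc is not legible in the source.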


The magnitude and phase of this target signature for 1, 2, and 3 scatterers are shown
in the middle and bottom of Figure 1, respectively. These figures show the effect of
interference between the scatterers when they enter a resolution cell. In the case of
a single scatterer, no interference is present and no change in signature is detected
as a function of resolution. For 2 and 3 scatterers the signature changes significantly.
When each individual scatterer can be resolved, there is no interference between
scatterers (neglecting effects from higher order sidelobes). As the resolution degrades,
the first order sidelobe has an effect on the signature of the scatterer. When the
resolution becomes large enough to contain all of the scatterers, the change in signature
is most pronounced.
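
The interference effect described above can be reproduced numerically. The following
is a minimal sketch, not the report's own code. It assumes unit-amplitude scatterers
with illustrative phases, a sinc impulse response of width rho evaluated at the pixel
x = 0, and no amplitude normalization. The function name `signature` and all numerical
values are hypothetical.

```python
import numpy as np

def signature(positions, amplitudes, resolutions, x=0.0):
    """Complex pixel value at x versus resolution for a set of point scatterers.

    Assumed model (illustrative only):
        T(x; rho) = sum_i a_i * sinc((x - x_i) / rho),
    with np.sinc(u) = sin(pi u) / (pi u).
    """
    positions = np.asarray(positions, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=complex)
    rho = np.asarray(resolutions, dtype=float)[:, None]  # one row per resolution
    return np.sum(amplitudes * np.sinc((x - positions) / rho), axis=1)

resolutions = np.linspace(0.25, 4.0, 64)          # fine to coarse, arbitrary units
spacing = 1.0                                     # equidistant azimuth spacing
phases = np.exp(1j * np.array([0.0, 2.0, 4.0]))   # assumed scatterer phases

one = signature([0.0], phases[:1], resolutions)
two = signature([0.0, spacing], phases[:2], resolutions)
three = signature([-spacing, 0.0, spacing], phases, resolutions)

# Magnitude and phase signatures as functions of resolution
magnitude_three, phase_three = np.abs(three), np.angle(three)
```

Under these assumptions the single-scatterer signature is constant in resolution, while
the two- and three-scatterer signatures change as neighboring scatterers enter the
resolution cell.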

Natural terrain typically consists of a large collection of small amplitude scatterers
that are randomly distributed within each resolution cell. These random paths are
shown in Figure 2. Thus SAR imagery of terrain, i.e., clutter, is frequently modeled as
a Gaussian random field by appealing to the law of large numbers [23] and references
therein. The result is that the amplitude and phase of a clutter pixel vary randomly
as a function of resolution. In particular, for Gaussian clutter the amplitude has
a Rayleigh distribution with parameter proportional to the resolution and clutter
reflectivity, while the phase is uniformly distributed over $[0, 2\pi)$.

The resolutions where the multiresolution discriminant is applicable must be
coarse enough to allow multiple prominent scatterers in a resolution cell but not
so coarse that the law of large numbers can be applied for target signatures. As
shown in [1], approximately 8 scatterers of equal value in a resolution cell is the limit
where the signature can be said to be nonrandom. The resolutions at which the
multiresolution feature vector is constructed must therefore be chosen appropriately.

A natural concern that arises when images of various resolutions are to be jointly
processed/analyzed is the correspondence of a pixel from image to image. This
problem can be circumvented by choosing the maximum aperture extent (finest resolution)
a priori, sampling at a rate defined by the Nyquist rate at that resolution and
retaining that sampling rate as the resolution changes. The images at coarser resolutions
are effectively oversampled. Each pixel then corresponds directly to the same pixel in
every other multiresolution image.
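
The common-grid idea can be sketched as follows. This is an illustration under stated
assumptions rather than the processing chain used in the report: the phase history is
taken to be the 1-D discrete Fourier transform of the scene, coarser resolutions are
obtained by keeping a centered, unweighted fraction of the aperture, and the helper
name `multires_images` is hypothetical.

```python
import numpy as np

def multires_images(phase_history, aperture_fractions):
    """Form images at several resolutions on one common pixel grid.

    phase_history: 1-D complex array sampled over the full (finest) aperture.
    aperture_fractions: values in (0, 1]; a smaller fraction is a shorter
    aperture and therefore a coarser resolution. The output length never
    changes, so coarse images are oversampled and pixel k refers to the
    same location in every image.
    """
    n = phase_history.size
    freq = np.fft.fftfreq(n)                      # normalized frequency, cycles/sample
    images = []
    for frac in aperture_fractions:
        window = np.abs(freq) <= 0.5 * frac       # unweighted (rect) sub-aperture
        images.append(np.fft.ifft(phase_history * window))
    return np.stack(images)

scene = np.zeros(64, dtype=complex)
scene[20] = 1.0                                   # single point scatterer at pixel 20
stack = multires_images(np.fft.fft(scene), [1.0, 0.5, 0.25])
```

Because every image in `stack` has the same length, the scatterer peak stays at the same
pixel index at all three resolutions and only the mainlobe width changes.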

In section 2 we will develop the statistical multiresolution process models. This
section will also discuss a simple resolution sampling strategy which whitens the
multiresolution clutter process. Section 3 will discuss the detection strategies that
we will employ in our studies. The detection strategies will be based on a composite
hypothesis test which determines the presence of a mean value in the process. Special
cases encompassing a Generalized Likelihood Ratio Test (GLRT) approach will also be
discussed. A second resolution sampling strategy based on a generalized matched
filter is also developed and discussed in the appendices. Section 4 discusses the
overall detection algorithm developed in the Strategic Target Algorithm Research
(STAR) [22] program where our detection strategies were inserted. Section 5 discusses
the performance results of our detection strategies when applied to simulated data.
Section 5 also discusses the detection results when the multiresolution based algorithm
was applied to the Lincoln Laboratories STAR data set and to ERIM DCS data. A set
of appendices is also given which reviews a multiresolution sampling strategy when
a generalized matched filter strategy is used and some properties of noisy wavelet
transforms.

Figure 1: Top: Simple three-scatterer target example. Middle: Magnitude signature
of simple three-scatterer target as a function of resolution. Bottom: Phase signature
of simple three-scatterer target as a function of resolution.

Figure 2: Magnitude signature of random clutter (clutter paths) as a function of resolution.

2 Multiresolution Process Statistics


In our construction, we model the scene $c(x)$, $x \in \mathbb{R}^2$, as a collection of point
scatterers, in which each point scatterer is specified by its location $x_k \in \mathbb{R}^2$
and complex reflectivity $u_k \in \mathbb{C}$:

$$ c(x) = \sum_{k=1}^{K} u_k \, \delta(x - x_k). \qquad (5) $$

This approach has been used to model both clutter [23, and references therein] and
objects that consist of collections of point reflectors [19, for instance] (i.e., trihedrals
or corner reflectors). We refer to $c$ as the complex reflectivity function.

The complex-valued SAR image, $T(x; \rho)$, taking into account resolution, can be
written as a convolution between the complex reflectivity function, $c(x)$, and the
system impulse response, $h(x)$, [2]:

$$ T(x; \rho) = \frac{1}{\rho} \int c(y) \, h\!\left(\frac{x - y}{\rho}\right) dy = \frac{1}{\rho} \sum_{k=1}^{K} u_k \, h\!\left(\frac{x - x_k}{\rho}\right). \qquad (6) $$

If we model the locations $\{x_k\}$ as a Poisson point process with intensity $\lambda(x)$, then
$c(x)$ is a compound point process or marked Poisson process with marks $\{u_k\}$ [24,
Chapter 3]. $T(x; \rho)$ forms a filtered Poisson process at each resolution $\rho$ [24, Chapter 4].

In the next two subsections, we consider models for the statistics of the observed
radar image, $T$, under simple assumptions on the statistics of the scatterer locations
$\{x_k\}$ and their complex reflectivities $\{u_k\}$.


2.1  Case  1:  Natural  Clutter 

For natural terrain, i.e., clutter, a typical assumption is that each resolution cell in the
SAR image contains a large number of small amplitude scatterers. In this case, we will
assume that the complex reflectivities $\{u_k\}$ are independent, identically distributed
(iid) random variables with mean zero and covariance $\sigma_c^2 I$ where $I$ is the identity
matrix.

Using a slight generalization of a result in [23], we can invoke the Generalized
Multivariate Central Limit Theorem [16] to show that the joint density of $T(x; \rho)$
converges to multivariate Gaussian in both space and resolution as the number of
scatterers $K$ tends to infinity. For a set of iid circular complex random variables
$X_1, \ldots, X_N$ and a set of arbitrary weights $a_{j1}, \ldots, a_{jN}$, $j = 1, \ldots, J$,
the weighted sums are jointly Gaussian in the limit,

$$ \left[ \sum_{k=1}^{N} a_{1k} X_k, \; \ldots, \; \sum_{k=1}^{N} a_{Jk} X_k \right]^t \sim \mathcal{N}_c(0, \Sigma) \quad \text{as } N \to \infty, \qquad (7) $$

under a normalization of the weights that is not legible in the source.

By assuming that the clutter scene $c(x)$, $x \in \mathbb{R}^2$, consists of a set of random point
scatterers ($c(x) = \sum_k u_k \delta(x - x_k)$) with $u_k$ iid, zero mean and with covariance
$\sigma_c^2 I$, the summation of the scatterers within a resolution cell produces a complex
Gaussian random variable. The weights described in equation 7 correspond to the impulse
response of the system.

The mean and covariance of the process are

$$ E\{T(x; \rho)\} = 0, \qquad (8) $$

and

$$ \Sigma(x, x'; \rho, \rho') = R(x, x'; \rho, \rho') = E\{T(x; \rho) \, T^*(x'; \rho')\} = \frac{\sigma_c^2}{\rho \rho'} \int h\!\left(\frac{x - y}{\rho}\right) h^*\!\left(\frac{x' - y}{\rho'}\right) dy, \qquad (9) $$

where $R(x, x'; \rho, \rho')$ is the correlation function of $T(x; \rho)$, and $\sigma_c^2$ is the variance of
the iid complex reflectivities $\{u_k\}$. Note that the covariance function $\Sigma$ is completely
specified by the variance $\sigma_c^2$ and the impulse response $h(x)$.

The choice of the impulse response $h$ influences the statistics of the SAR image
$T(x; \rho)$. This is especially true when examining $T(x; \rho)$ as a function of resolution.
For a specific spatial location $x = x' = x_j = 0$, the covariance in resolution becomes

$$ R_{x_j}(\rho, \rho') = \Sigma(0, 0; \rho, \rho') = \frac{\sigma_c^2}{\rho \rho'} \int h\!\left(-\frac{y}{\rho}\right) h^*\!\left(-\frac{y}{\rho'}\right) dy. \qquad (10) $$

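If the impulse response is chosen so that the correlation function $R$ satisfies a
scaling law condition [5] with respect to resolution, then the process $T(x; \rho)$ is
Gauss-Markov in resolution. For $\rho_l < \rho < \rho_u$ the scaling law is

$$ R_{x_j}(\rho_l, \rho_u) = \frac{R_{x_j}(\rho_l, \rho) \, R_{x_j}(\rho, \rho_u)}{R_{x_j}(\rho, \rho)}, \qquad (11) $$

where $R_{x_j}(\rho_l, \rho_u) = R(x_j, x_j; \rho_l, \rho_u)$. The scaling law as it pertains to the SAR impulse
response becomes

$$ \int h\!\left(-\frac{y}{\rho_l}\right) h^*\!\left(-\frac{y}{\rho_u}\right) dy = \frac{1}{\rho \, \|h\|^2} \int h\!\left(-\frac{y}{\rho_l}\right) h^*\!\left(-\frac{y}{\rho}\right) dy \int h\!\left(-\frac{y'}{\rho}\right) h^*\!\left(-\frac{y'}{\rho_u}\right) dy', \qquad (12) $$

where $\|h\|^2 = \int |h(y)|^2 \, dy$.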

The simplest means of analyzing equation 11 is through the Fourier transform of
the system impulse response

$$ h(x) = \int H(f) \, e^{j 2\pi f x} \, df. \qquad (13) $$

The scaling law can now be written in terms of the aperture weighting as

$$ \int H(\rho_l f) H^*(\rho_u f) \, df = \frac{\rho}{\|h\|^2} \int H(\rho_l f) H^*(\rho f) \, df \int H(\rho f') H^*(\rho_u f') \, df'. \qquad (14) $$

For $h(x) = \mathrm{rect}(x)$ or $H(f) = \mathrm{rect}(f)$ and $\rho < \rho'$, the correlation of the process
in resolution becomes

$$ R_{x_j}(\rho, \rho') = \frac{\sigma_c^2}{\rho'} \int H\!\left(\frac{\rho f}{\rho'}\right) H^*(f) \, df = \frac{\sigma_c^2}{\rho'}. \qquad (15) $$

Figure 3: Pictorial representation of a rect aperture weighting showing that the scaling
law in correlation is satisfied.


A simple substitution of equation 15 into equation 11 shows that the scaling law is
satisfied when the system impulse response is either sinc(x) or rect(x), so that the process
is Gauss-Markov. This is shown pictorially in Figure 3. The correlation of
$T(x_j; \rho)$ becomes $R_{x_j}(\rho, \rho') = \sigma_c^2 / \max\{\rho, \rho'\}$ with variance $\sigma^2 = \sigma_c^2 / \rho$.

$T(x; \rho)$ can also be shown to have independent increments in resolution, i.e.,

$$ E\{[T(x_j; \rho_1) - T(x_j; \rho_2)][T(x_j; \rho_2) - T(x_j; \rho_3)]^*\} = 0 \qquad (16) $$

for $\rho_1 > \rho_2 > \rho_3$. In this case, the clutter process $T(x_j; \rho)$ is a Brownian motion
process when viewed as a function of $1/\rho$ [7].

The Brownian motion nature of $T$ in resolution can be exploited to provide a
simple linear transformation of the process which whitens the process in resolution.
This transformation is based on the independent increments of the resolution process
and the scaling of the variance in resolution. Choose a set of resolutions
$\rho_1 < \ldots < \rho_i < \rho_{i+1} < \ldots < \rho_N$ where $\rho_{i+1} = \rho_i + \delta\rho_i$. An increments
process in resolution is formed by

$$ T'(x_j; \rho_i) = T(x_j; \rho_i + \delta\rho_i) - T(x_j; \rho_i) \qquad (17) $$

with $\delta\rho_i > 0$. This process has zero mean and is independent from resolution to
resolution.

Furthermore, by judicious choice of resolutions $\{\rho_i\}$, the variance of the difference
process $T'$ can be made constant from resolution to resolution. We choose adjacent
resolutions to satisfy $\frac{1}{\rho_i} - \frac{1}{\rho_{i+1}} = \gamma$ for every $i$. Under this latter
condition, the resolution step size $\delta\rho_i$ can be easily shown to be

$$ \delta\rho_i = \frac{\gamma \rho_i^2}{1 - \gamma \rho_i}. \qquad (18) $$

This resolution sampling strategy is shown in Figure 4 for two choices of $\gamma$. The
resolution sampling is dense at fine resolutions and becomes sparse at coarse resolutions.
For a fixed point $x_j$, we define the vector of resolution increments

$$ T'(x_j) = \{T'(x_j; \rho_1), \ldots, T'(x_j; \rho_{N-1})\}^t \qquad (19) $$

where $t$ denotes vector transpose. Then $T'(x_j)$ has distribution $T'(x_j) \sim \mathcal{N}_c(0, \gamma \sigma_c^2 I)$
where $\mathcal{N}_c(\mu, \Sigma)$ denotes a circular complex Gaussian density with mean $\mu$ and complex
covariance $\Sigma$. Here the symbol $\sim$ is shorthand notation for the phrase "has probability
distribution." We also define the vector

$$ T(x_j) = \{T(x_j; \rho_1), \ldots, T(x_j; \rho_N)\}^t \qquad (20) $$

which is a vector of samples from the original multiresolution process at the same set
of resolutions as the increments process.
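
The resolution sampling rule of equation 18 and its whitening effect can be checked
numerically. The sketch below is illustrative: it takes the clutter correlation in
resolution to be sigma_c^2 / max(rho, rho'), as stated above, and the function names
and parameter values are hypothetical.

```python
import numpy as np

def whitening_resolutions(rho_1, gamma, n):
    """Resolutions rho_1 < ... < rho_n with 1/rho_i - 1/rho_{i+1} = gamma."""
    rho = [float(rho_1)]
    for _ in range(n - 1):
        r = rho[-1]
        if gamma * r >= 1.0:
            raise ValueError("gamma * rho must stay below 1 for a finite step")
        rho.append(r + gamma * r**2 / (1.0 - gamma * r))   # step size of equation 18
    return np.array(rho)

def increment_covariance(rho, sigma2_c):
    """Covariance of the increments T'(rho_i) = T(rho_{i+1}) - T(rho_i)."""
    R = sigma2_c / np.maximum.outer(rho, rho)    # clutter correlation in resolution
    D = np.diff(np.eye(len(rho)), axis=0)        # first-difference operator
    return D @ R @ D.T

rho = whitening_resolutions(rho_1=1.0, gamma=0.05, n=10)
cov = increment_covariance(rho, sigma2_c=2.0)    # close to gamma * sigma2_c * identity
```

The step sizes `np.diff(rho)` grow with `rho`, which matches the statement that the
sampling is dense at fine resolutions and sparse at coarse resolutions.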

2.2 Case 2: Statistics of Cultural Objects

Many man-made or cultural objects typically consist of a small number of large
amplitude point scatterers. In the case of our multiresolution analysis, the physical
phenomenon we wish to explore is the interaction of a small number of local prominent
scatterers and the interference patterns that result. The number of local prominent
scatterers we wish to examine is typically fewer than eight. Larger numbers tend to
make the SAR signature exhibit zero mean complex Gaussian statistics [1].

Figure 4: Sampling strategy for the choice of resolutions to whiten the clutter
increments process in resolution (change in resolution increment size vs. resolution).

We will reformulate the process to have a random and nonrandom component.
The point scatterer model will become

$$ c(x) = \sum_{k=1}^{K} u_k \, \delta(x - x_k) + \sum_{k=1}^{K'} a_k \, \delta(x - \tilde{x}_k). \qquad (21) $$

The first term in Equation 21 will correspond to the clutter model as outlined in Case
1. The second term will correspond to an unknown set of prominent scatterers, written
here with locations $\tilde{x}_k$. The complex reflectivities $\{a_k\}$ are deterministic but
unknown. The spatial distribution of the prominent scatterers is also assumed to be
deterministic but unknown. $T(x; \rho)$ will be multivariate Gaussian with covariance as in
Case 1 (Eq. 10). The mean of the process is

$$ E\{T(x; \rho)\} = \frac{1}{\rho} \sum_{k=1}^{K'} a_k \, h\!\left(\frac{x - \tilde{x}_k}{\rho}\right). \qquad (22) $$

In the same spirit as Case 1, we form a vector of increments in resolution, $T'(x_j)$.
This process is complex Gaussian with distribution $T'(x_j) \sim \mathcal{N}_c(\mu, \gamma \sigma_c^2 I)$ where
$\mu = \{\mu_1, \ldots, \mu_{N-1}\}^t$ and

$$ \mu_i = E\{T'(x_j; \rho_i)\} = \sum_{k=1}^{K'} a_k \left[ \frac{1}{\rho_i + \delta\rho_i} \, h\!\left(\frac{x_j - \tilde{x}_k}{\rho_i + \delta\rho_i}\right) - \frac{1}{\rho_i} \, h\!\left(\frac{x_j - \tilde{x}_k}{\rho_i}\right) \right]. \qquad (23) $$

3 Detection Strategies


The goal of our detection scheme is to construct simple pixel screening algorithms.
Our detection strategies will exploit information provided by resolution. This
information is, in fact, produced by the local spatial signature as it is incorporated into
the resolution process. We derive three classes of multiresolution tests. The first
class is the Generalized Likelihood Ratio Test (GLRT). This test exploits the
oscillatory behavior of the multiresolution signature due to interference between scatterers.
We use an autoregressive (AR) signal model to characterize the oscillatory behavior.
We derive the general multiresolution GLRT using these signal models and a set of
simplified tests based on special cases of the models. The second class of tests is
composite hypothesis tests whose purpose is to detect differences in the mean value
of the multiresolution process. As shown in Section 2, when faced with a
deterministic but unknown set of scatterers (cultural object), there is a mean value on the
multiresolution process whereas clutter will be a zero mean process. Lastly, we will
briefly explore the construction of a generalized matched filter detector. This detector
formulation assumes that the multiresolution signature is known a priori. We use the
generalized matched filter as a means of deriving a resolution sampling strategy.

3.1  GLRT  for  Multiresolution  AR  Processes 

With the GLRT testing strategy, we are interested in testing for large local changes
in the statistical behavior of the data. Specifically, our comparison is between the
statistics of a local area $X_t$ presumed to contain a target and another area $X_c$ presumed
to be clutter. The configuration of the two areas under test is shown in Figure 5.
Due to the oscillatory nature of the multiresolution signatures as shown in Figure 1
we will use an autoregressive (AR) model for the multiresolution signatures. Each
area will have its own first order AR process model in resolution described as

$$ T(x_j; \rho_i) = a_c \, T(x_j; \rho_{i-1}) + c_i, \quad x_j \in X_c \qquad (24) $$

$$ T(x_j; \rho_i) = a_t \, T(x_j; \rho_{i-1}) + t_i, \quad x_j \in X_t \qquad (25) $$

where $c_i$ and $t_i$ are zero mean Gaussian random variables with variances $\sigma_c^2$ and $\sigma_t^2$,
respectively. We are assuming that the AR processes are spatially independent and
identically distributed over each local area.

Figure 5: Target and clutter regions which are tested in the GLRT for similarity.

We wish to test whether the inner area conforms to the same AR process as the
outer clutter area. The hypotheses under test are

$$ H_0 : a_c = a_t, \; \sigma_c^2 = \sigma_t^2 $$

$$ H_1 : a_c \neq a_t, \; \text{or} \; \sigma_c^2 \neq \sigma_t^2 $$

Our approach is to construct a GLRT to test for the difference between the two
areas. Let us define a data vector $T_l$, $l = t, c$ which incorporates the data in each
local area as

$$ T_l = \{T(x_1; \rho_1), \ldots, T(x_1; \rho_N), T(x_2; \rho_1), \ldots, T(x_2; \rho_N), \ldots, T(x_{J_l}; \rho_1), \ldots, T(x_{J_l}; \rho_N)\}^t, \quad x_j \in X_l, \; l = t, c. \qquad (26) $$


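Our test will use $N$ resolutions, $J_c$ pixels in area $X_c$ and $J_t$ pixels in area $X_t$. The
joint distribution for the spatial/multiresolution process in $X_l$ is

$$ p(T_l) = \left( \frac{1}{\sqrt{2\pi} \, \sigma_l} \right)^{N J_l} \exp\left\{ -\frac{\sum_{x_j \in X_l} \sum_{i=1}^{N} |T(x_j; \rho_i) - a_l T(x_j; \rho_{i-1})|^2}{2 \sigma_l^2} \right\} \cdot \exp\left\{ -\frac{\sum_{x_j \in X_l} |T(x_j; \rho_0)|^2}{2 \sigma_l^2} \right\}, \quad l = t, c. \qquad (27) $$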


The GLRT is defined as [4]

$$ \Lambda(T) = \frac{\max_{\theta_1} p_1(T | \theta_1)}{\max_{\theta_0} p_0(T | \theta_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \beta \qquad (28) $$

where $T$ is the entire data set encompassing both regions. $\theta_k = [a_k, \sigma_k^2]^t$, $k = 0, 1$ are
the unknown parameters in our test for each hypothesis. The extrema of this test
are satisfied using the maximum likelihood estimates of the parameters conditioned
on the hypotheses [4]. We will denote the parameter vector corresponding to these
estimates as $\hat{\theta}_k = [\hat{a}_k, \hat{\sigma}_k^2]^t$, $k = 0, 1$.

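To evaluate the GLRT, we will assume that the data in the two regions are
independent. Recall that under hypothesis $H_1$ the parameters of the AR processes
are assumed to be different. Using the assumption that the statistics of $X_c$ and $X_t$
are independent, the extrema of the joint conditional distribution $p_1(T | \theta_1)$ can be
decomposed as $\max_{\theta_1} p_1(T | \theta_1) = p(T_c | \hat{\theta}_c) \, p(T_t | \hat{\theta}_t)$. The parameters for each area are
estimated separately. Let $\hat{\theta}_l = [\hat{a}_l, \hat{\sigma}_l^2]^t$, $l = t, c$ be the parameter vectors
corresponding to the maximum likelihood estimates based solely on the areas $X_l$. These
parameter estimates are given as

$$ \hat{a}_l = \frac{\sum_{x_j \in X_l} \sum_{i=1}^{N} T(x_j; \rho_i) \, T^*(x_j; \rho_{i-1})}{\sum_{x_j \in X_l} \sum_{i=1}^{N} |T(x_j; \rho_{i-1})|^2} \qquad (29) $$

$$ \hat{\sigma}_l^2 = \frac{1}{N J_l} \sum_{x_j \in X_l} \sum_{i=1}^{N} |T(x_j; \rho_i) - \hat{a}_l T(x_j; \rho_{i-1})|^2, \quad l = t, c. \qquad (30) $$

Under hypothesis $H_0$, the AR models are assumed to be identical. Consequently,
the unknown coefficients are estimated using both regions $X_c$ and $X_t$. Again, using
the assumption that the statistics of $X_c$ and $X_t$ are independent, the extrema of
the joint conditional distribution $p_0(T | \theta_0)$ can be decomposed as
$\max_{\theta_0} p_0(T | \theta_0) = p(T_c | \hat{\theta}_0) \, p(T_t | \hat{\theta}_0)$. Let $\hat{\theta}_0 = [\hat{a}_0, \hat{\sigma}_0^2]^t$, and let
$X = X_c \cup X_t$. These estimates are

$$ \hat{a}_0 = \frac{\sum_{x_j \in X} \sum_{i=1}^{N} T(x_j; \rho_i) \, T^*(x_j; \rho_{i-1})}{\sum_{x_j \in X} \sum_{i=1}^{N} |T(x_j; \rho_{i-1})|^2} \qquad (31) $$

$$ \hat{\sigma}_0^2 = \frac{1}{N (J_c + J_t)} \sum_{x_j \in X} \sum_{i=1}^{N} |T(x_j; \rho_i) - \hat{a}_0 T(x_j; \rho_{i-1})|^2. \qquad (32) $$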


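For the case where many more clutter pixels are used than target pixels, $J_c \gg J_t$,
we have $\hat{a}_0 \approx \hat{a}_c$, $\hat{\sigma}_0^2 \approx \hat{\sigma}_c^2$, and $\hat{\theta}_0 \approx \hat{\theta}_c$. The GLRT can then be
approximated as

$$ \Lambda(T) = \frac{\max_{\theta_1} p_1(T | \theta_1)}{\max_{\theta_0} p_0(T | \theta_0)} = \frac{p(T_c | \hat{\theta}_c) \, p(T_t | \hat{\theta}_t)}{p(T_c | \hat{\theta}_0) \, p(T_t | \hat{\theta}_0)} \approx \frac{p(T_t | \hat{\theta}_t)}{p(T_t | \hat{\theta}_c)} \underset{H_0}{\overset{H_1}{\gtrless}} \beta. \qquad (33) $$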


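The log GLRT is then given as

$$ \phi(T) = \log \Lambda(T) = N J_t \left( \log(\sqrt{2\pi} \, \hat{\sigma}_c) - \log(\sqrt{2\pi} \, \hat{\sigma}_t) \right) + \frac{1}{2 \hat{\sigma}_c^2} \sum_{x_j \in X_t} \sum_{i=1}^{N} |T(x_j; \rho_i) - \hat{a}_c T(x_j; \rho_{i-1})|^2 \underset{H_0}{\overset{H_1}{\gtrless}} \beta \qquad (34) $$

where the constant $-N J_t / 2$ contributed by the target-model residual has been absorbed
into the threshold $\beta$.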

To help mitigate the assumption of statistical homogeneity over the target area,
we will only label the center pixel of the area $X_t$ with the outcome of the test,
rather than the entire target area as would be standard. This creates a pixel-by-pixel test
where the local target area is used to help provide averaging in the estimation of the
AR coefficient and variance.
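
A minimal sketch of the AR(1) parameter estimates and the approximate log-GLRT follows.
It is an illustration, not the report's implementation: it assumes each region is stored
as a complex array of shape (J, N + 1) holding resolutions rho_0 to rho_N along the second
axis, it uses the large-J_c approximation discussed above, and it drops additive constants.
The function names are hypothetical.

```python
import numpy as np

def ar1_estimates(T):
    """Least-squares / ML estimates of the AR(1) coefficient and noise variance.

    T: complex array of shape (J, N + 1), J pixels by resolutions rho_0 ... rho_N.
    """
    cur, prev = T[:, 1:], T[:, :-1]
    a_hat = np.sum(cur * np.conj(prev)) / np.sum(np.abs(prev) ** 2)
    sigma2_hat = np.mean(np.abs(cur - a_hat * prev) ** 2)   # average over N * J terms
    return a_hat, sigma2_hat

def log_glrt(T_target, T_clutter):
    """Approximate log-GLRT statistic for J_c >> J_t (constants dropped)."""
    a_c, s2_c = ar1_estimates(T_clutter)
    _, s2_t = ar1_estimates(T_target)
    cur, prev = T_target[:, 1:], T_target[:, :-1]
    n_terms = cur.size                                      # N * J_t
    resid_c = np.sum(np.abs(cur - a_c * prev) ** 2)         # target data, clutter model
    # N J_t (log sigma_c - log sigma_t) equals 0.5 N J_t (log sigma_c^2 - log sigma_t^2)
    return 0.5 * n_terms * (np.log(s2_c) - np.log(s2_t)) + resid_c / (2.0 * s2_c)
```

In use, `T_target` would hold the small window around the pixel under test and `T_clutter`
the surrounding region. The returned value is compared with a threshold and only the
center pixel of the target window is labeled.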


3.1.1  Special  Cases 

The special cases that we examine are motivated by models for the process statistics
of the clutter. For natural terrain, a typical assumption is that each resolution cell
in the SAR image contains a large number of small amplitude scatterers. We will
assume that the complex reflectivities of the scatterers are independent, identically
distributed (iid) random variables with mean zero and covariance $\sigma_c^2 I$ where $I$ is the
identity matrix.


Special  Case  1:  AR  vs.  Brownian  motion 


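This clutter model provides a motivation to test between an AR process indicative
of a target vs. Brownian motion clutter. Brownian motion is simply an AR process
where $a_c = 1$ [25]. The GLRT then simplifies to the test

$$ \phi(T) = N J_t \left( \log(\sqrt{2\pi} \, \hat{\sigma}_c) - \log(\sqrt{2\pi} \, \hat{\sigma}_t) \right) + \frac{1}{2 \hat{\sigma}_c^2} \sum_{x_j \in X_t} \sum_{i=1}^{N} |T(x_j; \rho_i) - T(x_j; \rho_{i-1})|^2 \underset{H_0}{\overset{H_1}{\gtrless}} \beta. \qquad (35) $$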


We can create a simpler test by assuming $\sigma_c^2 = \sigma_t^2$. If we also examine the case
where $X_t = x_j$ (a single target pixel), a Pearson correlation test can be constructed
as

$$ \phi(T(x_j)) = \frac{\left| \sum_i T(x_j; \rho_i) \, T^*(x_j; \rho_{i+1}) \right|}{\sum_i |T(x_j; \rho_i)|^2} \underset{H_0}{\overset{H_1}{\lessgtr}} \beta. \qquad (36) $$

The heuristic used here is that the AR coefficients of clutter will be close to 1 due
to the Brownian motion characteristics while the target AR coefficients would be less
than 1.


Special  Case  2:  AR  vs.  White  noise  process 

The statistical model derived in Section 2 for the multiresolution process showed
that a resolution increments process using a specific set of resolutions is a white noise
process. Using this increments process $T'$, a test between an AR process in $X_t$ and
white noise ($a'_c = 0$) in $X_c$ can be devised. The increments process for cultural objects
remains an AR process. However, the driving noise for the AR increments process
for the target is now correlated. For the sake of simplicity, we will approximate the
increments process for the target as AR with an iid noise source. In the derivation
of this detector, a simple replacement of the original process $T$ by $T'$ is made. The
multiresolution increments process can be described as

$$ T'(x_j; \rho_i) = a'_c \, T'(x_j; \rho_{i-1}) + c'_i, \quad x_j \in X_c \qquad (37) $$

$$ T'(x_j; \rho_i) = a'_t \, T'(x_j; \rho_{i-1}) + t'_i, \quad x_j \in X_t \qquad (38) $$

where $c'_i$ and $t'_i$ are zero mean Gaussian random variables with variances ${\sigma'_c}^2$ and
${\sigma'_t}^2$, respectively. We are testing between

$$ H_0 : a'_c = a'_t = 0, \; {\sigma'_c}^2 = {\sigma'_t}^2 $$

$$ H_1 : a'_t \neq a'_c = 0, \; \text{or} \; {\sigma'_t}^2 \neq {\sigma'_c}^2 $$

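The GLRT simply becomes an estimate of the AR coefficient in region $X_t$ as

$$ \phi(T') = |\hat{a}'_t| = \frac{\left| \sum_{x_j \in X_t} \sum_{i=1}^{N} T'(x_j; \rho_i) \, T'^*(x_j; \rho_{i-1}) \right|}{\sum_{x_j \in X_t} \sum_{i=1}^{N} |T'(x_j; \rho_{i-1})|^2} \underset{H_0}{\overset{H_1}{\gtrless}} \beta. \qquad (39) $$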


The processor can be further simplified when $X_c = X_t = x_j$. This is a single pixel
detector. The GLRT becomes the Pearson correlation test in resolution

$$ \phi(T'(x_j)) = \frac{\left| \sum_i T'(x_j; \rho_i) \, T'^*(x_j; \rho_{i+1}) \right|}{\sum_i |T'(x_j; \rho_i)|^2} \underset{H_0}{\overset{H_1}{\gtrless}} \beta. \qquad (40) $$


A related heuristic test can be constructed which exploits the notion that the
clutter increments process is uncorrelated. However, when a cultural object is present,
there will be a nonzero correlation coefficient due to the presence of a mean value.
This test is an unnormalized autocorrelation test [18]

$$ \phi(T'(x_j)) = \frac{1}{\hat{\sigma}_c^2} \left| \sum_i T'(x_j; \rho_i) \, T'^*(x_j; \rho_{i+1}) \right| \underset{H_0}{\overset{H_1}{\gtrless}} \beta. \qquad (41) $$

3.2 Composite Tests


We will base our detection strategies on the increments process $T'(x_j; \rho)$. The cultural
object signal model motivates detectors which exploit the difference in the mean when
either clutter or cultural objects are present. We will use the same convention as
Section 3.1 where we have a target area $X_t$ and an assumed clutter area $X_c$. For the
cases examined here $X_t = x_j$, so we have a pixel-by-pixel test. The hypotheses under
test will be

$$ H_0 : T'(x_j) \sim \mathcal{N}_c(\mu, \Sigma_{T'}), \; \mu = 0 $$

$$ H_1 : T'(x_j) \sim \mathcal{N}_c(\mu, \Sigma_{T'}), \; \mu \neq 0 $$

with $\Sigma_{T'} = \gamma \sigma_c^2 I$. We also have available $T'(x_j)$, $x_j \in X_c$, which are surrounding
locations assumed to be clutter.

We have chosen the composite test due to the severe variability of SAR signatures
when collection geometry and object condition are not known. This is opposed to
choosing a specific object signature and designing a matched filter to it. There does
not exist a uniformly most powerful test for the above composite hypothesis. The
optimal invariant test with respect to scale and orthogonal transformations is the
so-called F test [17] which is equivalent to

$$ \phi_2(T'(x_j)) = \frac{1}{(N - 1) \, \hat{\sigma}_c^2} \sum_{i=1}^{N-1} |T'(x_j; \rho_i)|^2 \underset{H_0}{\overset{H_1}{\gtrless}} \beta, \qquad (42) $$


where  a2c  is  the  estimate  of  cr*  from  surrounding  cells  assuming  no  target  is  present, 
i.e., 

r!L .v  \T'(x;\  o;)\2 

(43) 


O-c  = 


M(N  —  1) 


The distribution of the test statistic is $\psi_2(x_j) \sim F(2(N-1), 2M(N-1))$ under
$H_0$, and $\psi_2(x_j) \sim F(2(N-1), 2M(N-1); \eta)$ under $H_1$. Here $F(m, n)$
is the central F distribution with $m$ and $n$ degrees of freedom, and $F(m, n; \eta)$ is
the noncentral F distribution with the same degrees of freedom and noncentrality
parameter $\eta$. For the case of large $M(N-1)$, the estimate of $\sigma_c^2$ is almost exact. The
distributions become a scaled central chi-square under $H_0$ and a scaled noncentral
chi-square under $H_1$.
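The F test above can be illustrated numerically. The following is a minimal sketch, assuming the statistic is the increment energy at the test pixel divided by $(N-1)$ times the clutter variance estimated from $M$ surrounding pixels, as in (42) and (43). The function names, the simulated unit-power clutter, and the mean shift used for the target case are illustrative assumptions, not part of the original processing chain.

```python
import numpy as np


def f_test_statistic(test_increments, clutter_increments):
    """Sketch of the F statistic of (42) using the variance estimate of (43).

    test_increments:    complex array, shape (N-1,), increments at the test pixel
    clutter_increments: complex array, shape (M, N-1), increments at M clutter pixels
    """
    test_increments = np.asarray(test_increments)
    clutter_increments = np.asarray(clutter_increments)
    m, n_inc = clutter_increments.shape
    # Clutter variance estimate: average power over all clutter pixels and increments.
    sigma2_hat = np.sum(np.abs(clutter_increments) ** 2) / (m * n_inc)
    # Average increment power at the test pixel, normalized by the clutter estimate.
    return np.sum(np.abs(test_increments) ** 2) / (n_inc * sigma2_hat)


def unit_power_clutter(shape, rng):
    """Circular complex Gaussian samples with unit average power (assumed clutter model)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)


rng = np.random.default_rng(0)
n_inc, m = 14, 200                      # N - 1 increments, M clutter pixels (hypothetical values)
ring = unit_power_clutter((m, n_inc), rng)

# Clutter-only test pixels: the statistic should scatter around one.
psi_h0 = np.array([f_test_statistic(unit_power_clutter(n_inc, rng), ring)
                   for _ in range(2000)])
# Test pixel with a nonzero mean (hypothetical target): the statistic grows.
psi_h1 = f_test_statistic(unit_power_clutter(n_inc, rng) + 3.0, ring)
```

Under the clutter-only hypothesis the statistic averages close to one, so a threshold $\beta$ somewhat above one trades detection probability against false alarm rate.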

A  number  of  other  detection  strategies  are  often  used  to  screen  data.  Our  baseline 
test  with  which  we  compare  the  multiresolution  test  performance  is  a  pixel-by-pixel 
F  test  (Xt  =  x_j)  applied  to  single  resolution  SAR  image  data  at  the  finest  resolution. 
This test is

$$\psi_1(T(x_j)) = \frac{|T(x_j; \rho_1)|^2}{\hat{\sigma}_c^2} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \beta. \tag{44}$$

This test, which searches for signatures that are bright relative to the surrounding
clutter, has been used extensively in initial screening algorithms on SAR data [22].
It is the optimal invariant test with respect to scale and orthogonal transformations,
and it can be generalized to a multipixel test as

$$\psi_7(T) = \frac{\sum_{x_j \in X_t} |T(x_j; \rho_1)|^2}{\hat{\sigma}_c^2} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \beta. \tag{45}$$


3.3  Generalized  Matched  Filter 


The problem we consider is that of detecting a known target at a given spatial location
$x_0 \in \mathbb{R}^2$ embedded in zero-mean, complex-valued Gaussian clutter. Using the
development of Section 2, we can model the SAR image under the target present
hypothesis as:

$$H_1 : T(x; \rho) = S(x - x_0; \rho) + C(x; \rho).$$

Here, we assume that $C$ is a stationary complex-valued Gaussian process that models
ground clutter, while $S$ is a known coherent target signature. The target absent
hypothesis, $H_0$, is:

$$H_0 : T(x; \rho) = C(x; \rho).$$

Note that for simplicity we have ignored receiver noise and the fact that in many
cases the target occludes the ground clutter.

Single-Resolution Detector. In the single-resolution case, $\rho_0$, the likelihood ratio
test is equivalent to computing the following statistic $\psi$ and comparing it to a
threshold [26], [28]:

$$\psi = \int T(x; \rho_0) f(x - x_0)\, dx > \beta. \tag{46}$$

Here, $f$ is a solution to the integral equation

$$\int \Sigma(x, y) S(y; \rho_0)\, dy = f(x), \tag{47}$$

and $\Sigma$ is the covariance function for $C$. The detection problem is nonsingular for a
given signal $S$ if the integral equation (47) has a solution [28]. When $C$ is a white
process, the solution to (47) is $f = S^*$, and (46) becomes a matched filter detector.
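For the white-clutter case, the matched filter can be sketched on a sampled image. This is a minimal illustration, assuming a discrete sum in place of the integral in (46), a signature positioned by its top-left corner, and the real part of the correlation as the decision statistic. The random-phase signature and the function name are hypothetical.

```python
import numpy as np


def matched_filter_statistic(image, signature, row0, col0):
    """Discrete analogue of (46) with f = S*: correlate the image with the
    conjugated signature whose top-left corner is placed at (row0, col0)."""
    h, w = signature.shape
    patch = image[row0:row0 + h, col0:col0 + w]
    # Taking the real part is an assumption made here to obtain a real statistic.
    return float(np.real(np.sum(patch * np.conj(signature))))


rng = np.random.default_rng(1)
# Unit-power circular complex Gaussian clutter, white from pixel to pixel.
clutter = (rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))) / np.sqrt(2.0)
# Hypothetical known target: 8 x 8 unit-amplitude scatterers with random phases.
signature = np.exp(2j * np.pi * rng.random((8, 8)))

scene = clutter.copy()
scene[20:28, 30:38] += signature        # target present at (20, 30)

psi_target = matched_filter_statistic(scene, signature, 20, 30)
psi_clutter = matched_filter_statistic(scene, signature, 40, 5)
```

At the true location the statistic accumulates the signature energy coherently, while over clutter the phases do not align and the sum stays near zero.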

Multiresolution Detector. Suppose, however, that we are allowed to use the SAR
image $T$ at multiple resolutions. The multiresolution likelihood ratio test has a form
similar to the single-resolution case:

$$\psi = \int\!\!\int T(x; \rho) f(x - x_0; \rho)\, dx\, d\rho, \tag{48}$$

where now $f$ is a solution to the equation

$$\int\!\!\int \Sigma(x, \rho; y, \rho') S(y; \rho')\, dy\, d\rho' = f(x; \rho), \tag{49}$$

and $\Sigma$ is the covariance of $C(x; \rho)$.

Figure 6 shows receiver operating characteristics for the single-resolution and multiresolution
detectors for the signal $s$ shown in Figure 1. In computing the curves for Figure 6, $h$
was a sinc function and we imposed the constraint that the finest available resolution
$\rho$ is equal to one fourth of the spacing $D$ between adjacent scatterers ($\rho \geq D/4$). One
can see that the multiresolution setting admits a significant performance gain over
the single-resolution case.

Practical Considerations. In comparison to the single-resolution detector, Equation
(46), the multiresolution detector in Equation (48) requires evaluation of a double
integral involving a continuum of resolutions. Practical considerations may lead one
to use only a finite number of resolutions, $\rho_1, \ldots, \rho_n$, and to replace the statistic
$\psi$ in Equation (48) by some approximating statistic, say $\psi_n$.

A simple approach to approximating $\psi$ would be to replace the continuously-indexed
function $s(x; \rho)$ by its (discretized) wavelet or wavelet packet decomposition
[27], solve the corresponding integral equation for, say, $f_n(x_j; 2^{-k})$, and use $f_n$ in
place of $f$ in (48). From a practical standpoint, wavelets and wavelet packets have the
advantage that they are computationally very efficient. However, they are typically
limited to dyadic (triadic or similar) resolutions (i.e., integer powers of 2 or 3), which
are not necessarily the optimal choices of resolution.

As an alternative, we applied a sampling scheme described in [26] to solve the
problem of optimally approximating the test statistic using a finite number of resolutions.
This latter sampling scheme requires that one choose resolutions $\rho_1, \ldots, \rho_n$
to be the $n$ quantiles of the function $\lambda(\rho) = [\gamma(\rho) \|s(x; \rho)\|^2]^{1/3}$, where $\gamma$ is a function
depending on $\Sigma$, the covariance of $C(x; \rho)$, and $\|\cdot\|$ is the $L_2$ norm. Details of this
approach are described in the appendices.
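The quantile rule can be sketched numerically once a density over resolution is available. The sketch below assumes the density is tabulated on a grid and uses midpoint quantile levels $(k - 1/2)/n$; the quantile convention actually used in [26] may differ. The densities in the example are hypothetical placeholders and are not derived from $\gamma$ or $s$.

```python
import numpy as np


def quantile_resolutions(rho_grid, density, n):
    """Pick n resolutions as quantiles of a positive density tabulated on rho_grid.

    The midpoint levels (k - 1/2) / n are an assumed convention.
    """
    rho_grid = np.asarray(rho_grid, dtype=float)
    density = np.asarray(density, dtype=float)
    # Cumulative trapezoidal integral of the density, normalized to end at one.
    steps = 0.5 * (density[1:] + density[:-1]) * np.diff(rho_grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    levels = (np.arange(1, n + 1) - 0.5) / n
    # Invert the cumulative distribution by linear interpolation.
    return np.interp(levels, cdf, rho_grid)


# Resolutions in units of the scatterer spacing D, from D/4 to 2D.
rho = np.linspace(0.25, 2.0, 2001)
uniform = quantile_resolutions(rho, np.ones_like(rho), 4)
# Hypothetical density that favors fine resolutions.
fine_weighted = quantile_resolutions(rho, rho ** -2.0, 4)
```

A flat density spreads the samples evenly, whereas a density concentrated at fine resolutions pulls most of the samples toward the fine end of the range, unlike a fixed dyadic grid.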

Figure 6 additionally shows the performance of two multiresolution-based detectors
that use a finite set of resolutions. The wavelet-based resolution decomposition
uses the dyadic resolutions $\{D/4, D/2, D, 2D\}$, while the optimal multiresolution sampling
scheme for $n = 4$ uses the resolutions $\{0.25D, 0.35D, 0.525D, 1.5D\}$.

Figure 6 shows that the optimal resolution sampling scheme performed much
better than the wavelet sampling scheme. Finally, a fifth curve in Figure 6 shows
that optimum detector performance is achieved for large enough $n$.

4 STAR Algorithm Description

4.1  Introduction 

This  section  reviews  the  algorithm  developed  under  the  Strategic  Target  Algorithm 
Research  (STAR)  program  executed  for  Lincoln  Laboratories.  This  algorithm  detects 
extended  targets  in  clutter.  It  served  as  a  basis  for  insertion  of  our  multiresolution 
detection  strategy  and  as  a  means  of  performance  comparison.  Though  originally 
developed  for  polarimetric  data,  the  ERIM  STAR  algorithm  is  general  in  the  sense 
that  it  accepts  any  multi-channel  data  set  such  as  multiresolution  data.  We  used  this 
algorithm  with  both  single  resolution  and  multiresolution  data. 

The ERIM STAR detection algorithm exploits statistical knowledge of local clutter.
Figure 7 illustrates the logical structure of the algorithm, which consists of five
components: contrast-enhancement/speckle-reduction, CFAR screening, aggregation,
feature extraction, and assignment of a cue rating factor (CRF, also known as a decision
statistic). ERIM’s detection algorithms assign CRFs to aggregate clusters of pixels (blobs).
High CRFs indicate high confidence that a blob is a target-of-interest, while low
CRFs indicate low confidence. By definition, the targets-of-interest are the large and
medium vehicles; all other man-made and natural objects are uninteresting. Reports
of the blob locations, together with their CRFs, are passed on to scoring software,
which evaluates ROC curves by thresholding the CRFs and counting ground-truthed
detections and false alarms.

Contrast-enhancement/speckle-reduction (CESR) is a pixel-by-pixel operation
that exploits second-order statistical properties of multi-channel clutter to produce
an enhanced real-valued image from the complex-valued multi-channel measurements.
By simultaneously increasing the contrast between targets and surrounding clutter
and minimizing clutter speckle (without compromising resolution), CESR improves
the capability of CFAR screening to highlight target pixels and reject false alarm pixels.
Empirical observation indicates that the upper (but not lower) tails of the clutter
distributions of the CESR image can be conservatively (in terms of false alarm rate)
modelled as log-normal. Thus we evaluate a CFAR statistic which measures the local
contrast (in local standard deviations from the local mean) of the log CESR image.
We employ a comparatively low screening threshold that, based on log-normal theory,
predicts a false alarm rate of $10^3$ FA/km$^2$. This threshold provides at least one CFAR
hit on 98% of the deployed large and medium training targets.

Figure 7: Original STAR multi-channel contrast-based detection algorithm flow.

Obscured targets generally produce multiple disconnected CFAR hits in close
proximity to one another. Thus, we employ a cascade of dilation and skeletonization
operators to aggregate nearby (on a target-sized scale) CFAR hits into connected,
multi-pixel (spatially distributed) blobs. Ideally, the size and shape of target-induced
blobs reflect the size and shape of the underlying targets; obscuration generally reduces
the fidelity of the blob size and shape. False alarm rates based on counting
aggregate blobs, each characterized by its maximum CFAR statistic, are typically an
order of magnitude lower than those based on counting individual-pixel CFAR hits.

The detection algorithm assigns to each blob the maximum value of the CFAR statistic
over the pixels comprising the blob. Our detection algorithm thus combines the notion of
pixel-by-pixel CFAR with aggregation. The detector exploits only the maximum available
contrast against local clutter as its measure of “targetness”.

The discriminator exploits more characteristics, or features, of interesting targets
than does the detector. Consequently, the discriminator does a better job than the
detector of rejecting culturally-induced false alarms, such as buildings, the small target,
uncamouflaged vehicles on roads, trihedrals, sensor artifacts, etc. In addition,
the discriminator does a better job of rejecting clutter-induced false alarms than the
detector.

The multiresolution detection strategies effectively replace the front end of the
STAR algorithm, which is concerned with the pixel-by-pixel detection operation. This
is shown in Figure 8. Our conjecture is that detection strategies based on multiresolution
signatures will yield enhanced target/clutter contrast. As a means
of testing that conjecture, the threshold values remained constant when single resolution
and multiresolution data were used. Another conjecture was that a higher
percentage of pixels will be detected over the extended target using multiresolution
data. This conjecture would imply that a redesign of the post pixel-by-pixel detection
morphology would be in order: the morphology would be made less aggressive. We
decided, however, to leave the morphology unaltered since we found few instances of
multiple targets merging together. Lastly, we replaced the discriminant module of
the STAR algorithm, which is based on a quadratic discriminator, by a tree structured
classifier. The quadratic discriminator is fundamentally based on the joint Gaussian
nature of the feature vector. The tree structured classifier does not make this assumption,
which we felt was more realistic with respect to the statistics of the derived
features.

4.2  Algorithm  Description 

4.2.1  Contrast  Enhancement 

Poor contrast between targets and background clutter, together with speckle, are major
factors which limit performance of automated SAR image-based target detection
algorithms. Contrast-enhancement/speckle-reduction (CESR) techniques exploit the
multi-channel data available at each pixel to maximize contrast and minimize speckle
while preserving resolution. A vector $T$ of complex-valued data is associated with
each pixel in a multiresolution SAR image. Alternatively, we can use the multiresolution
increments process $T'$. CESR techniques combine these measurements
into a single real value that exhibits maximum contrast with respect to background
clutter, and which represents an optimum estimate of the intensity of a Gaussian
speckle-modulated field. The CESR process evaluates a quadratic form involving the
data vector at each pixel and the inverse of the typical clutter covariance matrix.
Both contrast-enhancement and speckle-reduction improve performance of the initial
CFAR detection stage. Improving the target-to-clutter ratio effectively increases the
separation between the image target and clutter means, while reducing speckle decreases
the standard deviation of the clutter (and possibly the target). Here we provide
the contrast-enhancement motivation for the CESR processor; Novak [1] provides a
speckle-reduction motivation.

Figure 8: Algorithm flow of multiresolution/STAR detection algorithm.

Consider, at each pixel $x_j$, a linear combination of the data available at that pixel,
$g^\dagger(x_j) T(x_j)$, where $g(x_j)$ is a vector of spatially-adaptive filter coefficients. The contrast
ratio (target-to-local-clutter ratio) at the output of this filter is:

$$C(x_j) = \frac{|g^\dagger(x_j) T(x_j)|^2}{E_{x_i \in X_c}\left( |g^\dagger(x_j) T(x_i)|^2 \right)}. \tag{50}$$

The $E_{x_i \in X_c}(\cdot)$ operator denotes expectation over an annular twice-target-sized spatial
region $X_c$ centered on pixel $x_j$. $\Sigma_c(x_j) = E_{x_i \in X_c}\left( T(x_i) T^\dagger(x_i) \right)$ denotes the local clutter
covariance matrix. A simple eigenvalue argument establishes the optimum contrast
enhancement filter as $g(x_j) = \Sigma_c^{-1} T(x_j) / \|\Sigma_c^{-1/2} T(x_j)\|$. The maximized contrast
produced by this data-dependent filter is $T^\dagger(x_j) \Sigma_c^{-1} T(x_j)$. The contrast-enhanced
image, in amplitude units, is

$$i_{CE}(x_j) = \sqrt{T^\dagger(x_j) \Sigma_c^{-1}(x_j) T(x_j)}. \tag{51}$$
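The quadratic form in (51) and the optimality of the contrast filter can be checked numerically. The following is a minimal sketch, assuming a single clutter covariance matrix shared by all pixels and a contrast ratio of the form $|g^\dagger T|^2 / (g^\dagger \Sigma g)$. The function names and the randomly generated covariance are illustrative assumptions.

```python
import numpy as np


def cesr_amplitude(data, clutter_cov):
    """Contrast-enhanced amplitude sqrt(T^H Sigma^-1 T) for data of shape (..., K)."""
    cov_inv = np.linalg.inv(clutter_cov)
    quad = np.einsum("...i,ij,...j->...", np.conj(data), cov_inv, data)
    # The quadratic form is real for a Hermitian covariance; drop rounding residue.
    return np.sqrt(np.real(quad))


def contrast_ratio(g, t, clutter_cov):
    """Output contrast |g^H t|^2 / (g^H Sigma g) of a linear filter g."""
    return np.abs(np.vdot(g, t)) ** 2 / np.real(np.vdot(g, clutter_cov @ g))


rng = np.random.default_rng(2)
k = 4                                             # number of channels (hypothetical)
a = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
cov = a @ a.conj().T + np.eye(k)                  # Hermitian positive definite clutter covariance
t = rng.standard_normal(k) + 1j * rng.standard_normal(k)

g_opt = np.linalg.solve(cov, t)                   # proportional to Sigma^-1 T
g_other = np.ones(k, dtype=complex)               # an arbitrary competing filter
best = contrast_ratio(g_opt, t, cov)
other = contrast_ratio(g_other, t, cov)

# Whole-image use: with an identity covariance the output is the vector norm per pixel.
cube = rng.standard_normal((8, 8, k)) + 1j * rng.standard_normal((8, 8, k))
image = cesr_amplitude(cube, np.eye(k))
```

The contrast achieved by the filter proportional to $\Sigma^{-1} T$ equals the squared amplitude of (51), and no other filter exceeds it; the scale of the filter does not affect the ratio.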

Covariance matrices for a wide variety of natural clutter, evaluated over twice-target-sized
averaging windows, differ principally in terms of their scaling level. The
relative levels of the covariance matrix components vary only slightly with terrain
type. Thus, we assume that the local clutter covariance is of the product-model
form $\Sigma_c(x_j) = \sigma^2(x_j) \tilde{\Sigma}$, where $\tilde{\Sigma}$ is normalized so that its span is one. Thus the
maximized local contrast is

$$C_{\max}(x_j) = \frac{T^\dagger(x_j) \tilde{\Sigma}^{-1} T(x_j)}{\sigma^2(x_j)}. \tag{52}$$

To preserve image level, we multiply the contrast by the local clutter cross-section
$\sigma^2(x_j)$. Thus the contrast-enhanced image, in amplitude units, is

$$i_{CE}(x_j) = \sqrt{T^\dagger(x_j) \tilde{\Sigma}^{-1} T(x_j)}. \tag{53}$$


If we use the covariance matrix for the increments process derived in Section 2
($\tilde{\Sigma} = \gamma I$), the contrast image simply becomes

$$i_{CE}(x_j) = \|T'(x_j)\|, \tag{54}$$

which is the F test described in Section 3.

4.2.2 CFAR Screening


Constant false alarm rate (CFAR) screening is a standard technique for isolating
pixels that appear statistically different from the surrounding pixels in an inhomogeneous
gray-scale image. In our application, we seek to isolate those pixels that exhibit
positive contrast with respect to the local clutter in excess of what one would anticipate,
with specified probability, from locally homogeneous clutter. Our detection
and discrimination algorithms use these “hits” to trigger the aggregation and feature
extraction.

To establish ball-park pixel-by-pixel screening false alarm rates, it is conservative
and convenient to model the upper (but not lower) tails of locally homogeneous log
CESR clutter as Gaussian. That is, if we measure the local mean and standard deviation
of homogeneous log CESR clutter, the associated Gaussian density lies slightly
above the upper tail of the clutter histogram. Based on this model, the screener evaluates
the two-parameter CFAR statistic, i.e., the log CESR value of the test pixel minus
the local clutter mean, divided by the local clutter standard deviation. When the
Gaussian log-CESR model is accurate, the CFAR statistic is a unit variance, zero mean
Gaussian random variable. Thresholding the CFAR statistic (local contrast) at threshold
$\beta$ leads to a per-pixel false alarm probability of $P_{FA} = \frac{1}{2}\,\mathrm{erfc}(\beta/\sqrt{2})$. The
corresponding false alarm rate (FA/km$^2$), assuming $M$ independent pixels per square
kilometer, is

$$\mathrm{FAR} = \frac{M}{2}\,\mathrm{erfc}\left(\frac{\beta}{\sqrt{2}}\right). \tag{55}$$

For resampled spotlight and stripmap ADTS data, $M = 16 \times 10^6$/km$^2$. Thus, our
CFAR screening threshold of 3.84 corresponds to $10^3$ FA/km$^2$, based on log-normal
theory. We also use a CFAR threshold of 2.50 to determine the prominent scatterers
on an aggregate blob; this corresponds to $10^5$ FA/km$^2$.
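The two quoted false alarm rates follow directly from (55). A short check, assuming only the Gaussian tail model and the pixel density $M = 16 \times 10^6$ per km$^2$ stated above (the function name is illustrative):

```python
import math


def false_alarm_rate(threshold, pixels_per_km2):
    """False alarms per square kilometer for a unit-variance Gaussian CFAR statistic."""
    p_fa = 0.5 * math.erfc(threshold / math.sqrt(2.0))   # per-pixel false alarm probability
    return pixels_per_km2 * p_fa


far_screening = false_alarm_rate(3.84, 16e6)   # on the order of 10^3 FA/km^2
far_prominent = false_alarm_rate(2.50, 16e6)   # on the order of 10^5 FA/km^2
```

Both values land within a few percent of the round figures used in the text.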

The CFAR screening algorithm uses a rectangular annular window, centered on
the test pixel, to define the clutter surrounding the test pixel and to evaluate its mean
and standard deviation. The window is one pixel thick, with a half-width equal to the
length of a large target (80 pixels). The window size ensures that pixels on one end
of the target will not corrupt clutter statistics when the test pixel is at the other end
of the target. Subject to this constraint, the window incorporates the pixels nearest
to the test pixel, which are most representative of local clutter.
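The annular-window statistic can be sketched as follows. This is a minimal illustration, assuming a square one-pixel-thick ring, the sample standard deviation, and no treatment of image borders (left as NaN). The small half-width in the example is chosen for speed only; the text uses 80 pixels. All names are hypothetical.

```python
import numpy as np


def ring_offsets(half_width):
    """Row and column offsets of a one-pixel-thick square ring around the test pixel."""
    r = half_width
    return np.array([(dr, dc)
                     for dr in range(-r, r + 1)
                     for dc in range(-r, r + 1)
                     if max(abs(dr), abs(dc)) == r])


def cfar_statistic(log_image, half_width):
    """Two-parameter CFAR statistic: (pixel - ring mean) / ring standard deviation."""
    rows, cols = log_image.shape
    offsets = ring_offsets(half_width)
    stat = np.full(log_image.shape, np.nan)      # border pixels are not evaluated
    for i in range(half_width, rows - half_width):
        for j in range(half_width, cols - half_width):
            ring = log_image[i + offsets[:, 0], j + offsets[:, 1]]
            stat[i, j] = (log_image[i, j] - ring.mean()) / ring.std(ddof=1)
    return stat


rng = np.random.default_rng(3)
log_cesr = rng.standard_normal((64, 64))         # stand-in for a log CESR clutter image
log_cesr[32, 32] += 10.0                         # one bright pixel
stat = cfar_statistic(log_cesr, half_width=8)
hits = stat > 3.84                               # screening threshold from the text
```

Because the ring excludes everything closer than the half-width, a bright pixel does not contaminate its own clutter estimate.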

4.2.3  Aggregation 

The goal of aggregation is to consolidate nearby disjoint hits from CFAR screening
into connected blobs which contain the prominent scattering centers together with
a minimal number of surrounding pixels. To accomplish this goal, we use two morphological
operators: dilation and skeletonization. Dilation grows disjoint regions together,
while skeletonization eliminates many of the non-prominent pixels introduced
by dilation yet preserves component connectivity. The effect of our aggregation operations
is to produce a blob which resembles a hull around the CFAR hits that extends
roughly seven pixels around those hits.

Dilation evaluates the local maximum of a gray-scale image over a binary-valued
sliding structuring element, which is analogous to the impulse response support of a
linear filter. The STAR aggregation algorithm applies a 5 x 5 octagonal structuring
element to the binary-valued hits produced by initial CFAR screening. Each dilation
effectively increases the diameter of a CFAR hit region by four pixels. Disjoint CFAR
hits separated by less than four pixels grow together into a merged region. We
perform a cascade of seven dilations, thereby merging (aggregating) CFAR hits that
are separated by less than 28 pixels (35% of the length of a large target) into a single
blob.
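The merging behaviour of the dilation cascade can be illustrated with a small binary example. The sketch below assumes that the 5 x 5 octagonal element is a 5 x 5 square with its four corner pixels removed, and it covers the dilation stage only; the skeletonization stage is omitted. The function names and hit positions are hypothetical.

```python
import numpy as np

# Assumed shape of the 5 x 5 octagonal structuring element (corners removed).
OCTAGON_5X5 = np.array([[0, 1, 1, 1, 0],
                        [1, 1, 1, 1, 1],
                        [1, 1, 1, 1, 1],
                        [1, 1, 1, 1, 1],
                        [0, 1, 1, 1, 0]], dtype=bool)


def dilate(mask, element):
    """Binary dilation of mask by a symmetric structuring element."""
    mask = np.asarray(mask, dtype=bool)
    er, ec = element.shape
    pr, pc = er // 2, ec // 2
    padded = np.pad(mask, ((pr, pr), (pc, pc)), mode="constant")
    out = np.zeros_like(mask)
    for dr in range(er):
        for dc in range(ec):
            if element[dr, dc]:
                out |= padded[dr:dr + mask.shape[0], dc:dc + mask.shape[1]]
    return out


def count_blobs(mask):
    """Count 8-connected regions with a simple flood fill."""
    mask = np.array(mask, dtype=bool)
    rows, cols = mask.shape
    count = 0
    for i, j in zip(*np.nonzero(mask)):
        if not mask[i, j]:
            continue                      # already absorbed into an earlier blob
        count += 1
        mask[i, j] = False
        stack = [(i, j)]
        while stack:
            r, c = stack.pop()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and mask[rr, cc]:
                        mask[rr, cc] = False
                        stack.append((rr, cc))
    return count


hits = np.zeros((64, 128), dtype=bool)
hits[32, 20] = hits[32, 40] = True        # two nearby CFAR hits
hits[32, 100] = True                      # one distant CFAR hit

merged = hits.copy()
for _ in range(7):                        # cascade of seven dilations
    merged = dilate(merged, OCTAGON_5X5)
```

The two nearby hits grow into a single region after the cascade, while the distant hit remains a separate blob.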

While dilation does merge prominent scatterers into aggregate blobs, the resulting
blobs contain many surrounding non-target and/or non-prominent pixels. To eliminate
these, one could use an erosion operator, which evaluates the local minimum
over a sliding structuring element. Unfortunately, erosion is the inverse of dilation,
and could re-fragment the blob. Instead, for binary-valued images, a skeletonization
operator exists, which is analogous to erosion, but contains a test to avoid fragmentation.
We utilize an 8-way connectivity skeletonization operator, which is analogous
to an erosion that uses a 3 x 3 square structuring element. We perform a cascade of
seven 8-way skeletonizations after the seven dilations.

A cascade of dilations followed by a cascade of skeletonizations is an easy way to
merge nearby CFAR hits into aggregate blobs. We chose the number of operators
in the cascade, seven, empirically on single resolution data. Had we chosen a very
low CFAR screening threshold, the resulting high false alarm rates would have made it
necessary to employ less aggressive aggregation to avoid merging the entire scene or
closely-spaced targets. Conversely, had we chosen a very high CFAR screening threshold,
it might have been necessary to employ more aggressive aggregation to connect more
widely-spaced target components.

5 Evaluation of Multiresolution CFAR Performance

This section outlines the empirical evaluation of the various detection strategies
using multiresolution signatures. We examine two cases: synthetically generated target
and clutter signatures, and actual collected SAR data of targets and
terrain from the STAR data set augmented by ERIM DCS data. The simulated
target signature is a SAR simulation of a Howitzer target produced by the Synthetic
Radar Image Model (SRIM) software. The synthetic background clutter is a homogeneous
complex Gaussian random process. The collected data incorporated
targets in various netted states and deployments, natural homogeneous and inhomogeneous
clutter, and cultural clutter. Section 5.1 discusses the methodology by which
we generated the synthetic scenes. Section 5.2 provides results and trade studies of
the multiresolution detection strategies when applied to synthetic scenes. Section 5.3
explains the results obtained when the multiresolution detection strategies were applied
to the collected SAR data sets. Finally, Section 5.4 explains the discrimination
algorithm that we used in these studies and the results that were obtained.

5.1  Synthetic  Scene  Generation 

The images used for synthetic test data were created by embedding 18 Synthetic
Radar Image Model (SRIM) generated targets into a simulated homogeneous clutter
background. The targets were 32x32 pixels in size at the finest resolution. The targets
were multiple realizations of a towed artillery piece. Each realization of the target
was generated from a different aspect angle; the angles ranged from 0 to 340 degrees in
increments of 20 degrees. The clutter was generated as a 512x512 random field. The
target-to-clutter ratio was 0 dB. The resolution of this simulation was 1 foot. This
data set is shown in Figure 9. An additional 512x512 clutter field without embedded
targets was also created for false alarm evaluation. The probability distribution of
the random field was circular complex Gaussian ($p(x_0, y_0) \sim \mathcal{N}_c(0, \sigma_{clut}^2)$) with unit
variance ($\sigma_{clut}^2 = 1$). The random process was independent from pixel to pixel at the
finest resolution.

Targets were embedded into the clutter by creating a mask defining the target extent
for each realization, then substituting the target pixels for the clutter pixels over
the target extent. The target extent was defined by thresholding the target signature
at a level 10 dB below the average target power. Target pixels which fell below the
10 dB threshold were eliminated. Target pixels which were below the threshold but
were wholly contained within the target support were allowed, however. This contiguous
pixel set defines the support (pixel set) of the target. For our discussion, we
will call this pixel set $X_{tgt}$. The target pixels contained in the mask were then scaled
to unit average power. Areas of the 512x512 homogeneous clutter field conforming to
specific target masks were extracted. Target signatures were then inserted into the
extracted areas, replacing the clutter signatures by the embedded target signatures.

The dynamic range of the target was also constrained to a level which was 10 dB
above the target’s average power, defined as

$$\sigma_{tgt}^2 = \frac{1}{M} \sum_{x_j \in X_{tgt}} |T(x_j)|^2, \tag{56}$$

where $M$ is the number of target pixels for that particular target. This was done to
reduce large glints in the target signature.

The target-to-clutter ratio (TCR) of the images is determined from the average
powers of the target and clutter and is defined as

$$\mathrm{TCR} = \frac{\sigma_{tgt}^2}{\sigma_{clut}^2}. \tag{57}$$

To achieve a particular TCR, the target average power is held constant whereas
the clutter power is scaled by $\sigma_{tgt}^2/\mathrm{TCR}$ (recall that the original $\sigma_{clut}^2 = 1$). The targets
are then embedded into clutter with the appropriate variance for the chosen TCR.
The TCR used in our studies was 0 dB.
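The embedding procedure can be sketched in a few lines. This is a simplified illustration under stated assumptions: a random complex array stands in for an SRIM target chip, the mask is a plain threshold at 10 dB below the average target power (interior hole filling and glint clipping are omitted), and a smaller clutter field is used for speed. All function names are hypothetical.

```python
import numpy as np


def complex_gaussian(shape, power, rng):
    """Circular complex Gaussian field with the requested average power."""
    scale = np.sqrt(power / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def target_mask(signature, drop_db=10.0):
    """Pixels whose power is no more than drop_db below the average target power."""
    power = np.abs(signature) ** 2
    return power >= power.mean() * 10.0 ** (-drop_db / 10.0)


def embed_target(clutter, signature, mask, row0, col0):
    """Replace clutter pixels by target pixels over the masked target extent."""
    scene = clutter.copy()
    h, w = signature.shape
    window = scene[row0:row0 + h, col0:col0 + w]   # view into scene
    window[mask] = signature[mask]
    return scene


rng = np.random.default_rng(4)
tcr_db = 0.0

raw_target = complex_gaussian((32, 32), 1.0, rng)  # hypothetical stand-in for a target chip
mask = target_mask(raw_target)
# Scale the masked target pixels to unit average power.
target = raw_target / np.sqrt(np.mean(np.abs(raw_target[mask]) ** 2))
sigma2_tgt = np.mean(np.abs(target[mask]) ** 2)

# Clutter power that yields the requested target-to-clutter ratio.
clutter_power = sigma2_tgt / 10.0 ** (tcr_db / 10.0)
clutter = complex_gaussian((256, 256), clutter_power, rng)
scene = embed_target(clutter, target, mask, 100, 100)
```

The same mask that defines where target pixels replace clutter can later serve as ground truth when scoring detections.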

The mask used to define the extent of the embedded targets is also used to
evaluate the output of the detectors. The mask is used to determine whether
a detected pixel corresponds to a true target pixel or is a clutter false alarm. The
probabilities of detection and false alarm are then estimated from the detector output
as

$$P_D = \frac{\#\text{ of detected pixels} \mid \text{target}}{\#\text{ of target pixels}} = \frac{n_{tgt}}{N_{tgt}}, \tag{58}$$

$$P_{FA} = \frac{\#\text{ of detected pixels} \mid \text{clutter}}{\#\text{ of clutter pixels}} = \frac{n_{fa}}{N_{cl}}. \tag{59}$$

Figure 9: Portion of the data set used in the detector performance studies showing
embedded targets with a 0 dB target-to-clutter ratio.
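The pixel-level scoring in (58) and (59) amounts to counting threshold crossings inside and outside the target mask. A minimal sketch with a small hand-made example (the array values and the function name are illustrative only):

```python
import numpy as np


def detection_rates(statistic, truth_mask, threshold):
    """Estimate P_D and P_FA by counting detections on target and clutter pixels."""
    detected = statistic > threshold
    p_d = np.count_nonzero(detected & truth_mask) / np.count_nonzero(truth_mask)
    p_fa = np.count_nonzero(detected & ~truth_mask) / np.count_nonzero(~truth_mask)
    return p_d, p_fa


statistic = np.array([[5.0, 1.0, 1.0, 1.0],
                      [4.0, 0.0, 3.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0]])
truth_mask = np.zeros((4, 4), dtype=bool)
truth_mask[0, 0] = truth_mask[1, 0] = truth_mask[0, 1] = True   # three target pixels

p_d, p_fa = detection_rates(statistic, truth_mask, threshold=2.0)
```

Sweeping the threshold and recording each resulting pair traces out an empirical ROC curve.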


The finite test sample size affects the fidelity of the processor performance estimates.
The most traditional method is to look upon the decisions at each pixel as an
independent Bernoulli trial [6]. A signal-to-noise ratio ($\mathrm{SNR}_{\hat{P}}$) of the estimate can
be defined as the mean-square to variance ratio and is given by

$$\mathrm{SNR}_{\hat{P}} = \frac{E^2\{\hat{P}\}}{\sigma_{\hat{P}}^2}. \tag{60}$$

Description                                            Notation
Single Pixel/Single Res. F test                        $\psi_1(T(x_j))$
Multi-res. F test (increments)                         $\psi_2(T'(x_j))$
Multi-res. Pearson corr. test (increments)             $\psi_3(T'(x_j))$
Multi-res. Pearson corr. test                          $\psi_4(T(x_j))$
Multi-res. AR test (AR vs. Brownian motion)            $\psi_5(T)$
Multi-res. AR test (increments, AR vs. white noise)    $\psi_6(T')$
Single Res./Multi-pixel F test                         $\psi_7(T)$

Table 1: List of the various multiresolution detection schemes used in the performance
studies.


In our case $N_{tgt} = 1.8 \times 10^4$ and $N_{cl} = 2.5 \times 10^5$. The false alarm estimates will have
signal-to-noise ratios of $\mathrm{SNR}_{P_{FA}} = 2.5 \times 10^3$ for $P_{FA} = 10^{-2}$ and $\mathrm{SNR}_{P_{FA}} = 24$ for
$P_{FA} = 10^{-4}$, which are acceptably large. The signal-to-noise ratios of target detection
estimates are computed in a similar manner. $\mathrm{SNR}_{P_D} > 10^5$ for $P_D$ down to 0.1, which
is also extremely large.
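The false alarm figures can be reproduced under the Bernoulli model. For $N$ independent trials with success probability $P$, the estimate has mean $P$ and variance $P(1-P)/N$, so the mean-square to variance ratio is $NP/(1-P)$. The sketch below assumes this closed form and the rounded clutter pixel count $N_{cl} = 2.5 \times 10^5$; the function name is illustrative.

```python
def estimate_snr(probability, n_trials):
    """Mean-square to variance ratio of a Bernoulli rate estimate: N P / (1 - P)."""
    return n_trials * probability / (1.0 - probability)


n_clutter = 2.5e5
snr_pfa_1e2 = estimate_snr(1e-2, n_clutter)   # roughly 2.5 x 10^3
snr_pfa_1e4 = estimate_snr(1e-4, n_clutter)   # roughly 25
```

The second value is close to the quoted 24; the small difference is presumably due to the rounding of $N_{cl}$.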

5.2  Empirical  Results  -  Simulated  Data 

This section discusses the results of the various target detection schemes when
multiresolution data is used. These detectors were applied to the generated scene
discussed in Section 5.1. Table 1 summarizes the detectors used in this study. It
should be stressed that the Receiver Operating Characteristic (ROC) curves shown
in this subsection are based on pixel-level detection and not on the extended
object. Extended object detection based on collected data is presented in the next
subsection.

The tests $\psi_1, \ldots, \psi_4$ used only the signature at the single pixel $x_j$, hence the
notation $\psi_n(T(x_j))$. In the GLRT tests ($\psi_5(T)$, $\psi_6(T')$), $X_t$ was a 3x3 pixel area. $X_c$
was a hollow ring of inner diameter 20 pixels and outer diameter 24 pixels. The inner
diameter was chosen to be approximately the size of the target. The multi-pixel F
test, $\psi_7(T)$, also used a 3x3 region for $X_t$.

The resolution sampling strategy was chosen to whiten the multiresolution increments
process as discussed in Section 2. We used the same resolution sampling
strategy for both the original multiresolution process and the multiresolution increments
process. The resolution studies only examined azimuth resolution as a running
variable. Range resolution was fixed at 1 ft. This was due to the conjecture that
multiresolution processing may mitigate the requirement for fine resolution collection
in azimuth. This has implications for wide area search scenarios since collection time
per scene is directly tied to azimuth resolution.

Figure 10 shows the detection statistics of the single pixel detection schemes. In
each case, 15 resolutions were used ranging from 1 to 10 ft. The multiresolution
tests provide better contrast than the single pixel single resolution test, $\psi_1$, which is
our baseline. The multiresolution F test, $\psi_2$, affords the best contrast between the
target set and the clutter. $\psi_2$ and $\psi_3$ show a smearing of the signature due to the
multiresolution processing. $\psi_4$, however, does not exhibit this smearing. Figure 11
shows the empirical Receiver Operating Characteristic (ROC) of three of the single
pixel tests: the single resolution/single pixel F test $\psi_1$, the multiresolution F test
on the increments process $\psi_2$, and the Pearson correlation test on the increments
process $\psi_3$. This figure shows the performance gain provided by the multiresolution
signatures. The multiresolution F test, $\psi_2$, provided the best result. Performance
of the tests increased as the number of resolutions used increased. This is seen by
comparing the performance when 6 and 10 resolutions are used. Lastly, note
that the multiresolution F test using a 2 ft. starting resolution performed as well
as the single resolution F test using 1 ft. data. This has implications regarding the
amount of aperture and, hence, search rate, of a SAR system.

The decision statistics showing the increased contrast between the targets and surrounding
clutter for the detection schemes using spatial data $T$ are shown in Figure
12. Note that the spatial/multiresolution GLRT testing for an AR process vs. Brownian
motion, $\psi_5(T)$, has the best contrast. Figure 13 shows the Receiver Operating
Characteristics of the spatial tests $\psi_5$, $\psi_6$, $\psi_7$ and compares them to the baseline $\psi_1$.
$\psi_5$ performed the best and significantly outperformed the single resolution F test, $\psi_1$,
and the spatial F test $\psi_7$. The spatial F test and the multiresolution increments test
$\psi_6$ performed equivalently. The reduced performance of $\psi_6$ is due to the assumed
iid driving noise process in the target process model. We expect that if that test is
reformulated such that the appropriate driving noise process is accounted for,
its performance will be at the level of or better than $\psi_5$. Histograms of the various
decision statistics are shown in Figure 14. These histograms show the increase in
separation between the target and clutter signatures when multiresolution processing
is applied.

Figure 10: Decision statistics of single pixel tests. (Panel titles include "Single Res. 1 ft." and "1-10 ft., N=15".)

Figure 11: ROC results for various single pixel multiresolution detection schemes.

The next set of figures explores some of the specific tests more fully and brings
out some salient features of multiresolution processing. Figures 15 and 16 show the
performance gain as the number of resolutions used in the multiresolution F test, $\psi_2$,
increases. Using a starting resolution of 1 ft., there is a significant gain in performance
going from 5 to 10 resolutions. The additional gain is smaller when 15 resolutions are
used; at 15 resolutions, the performance gain is effectively saturated. Starting from 1.5 ft.
resolution, the amount of performance gain is severely reduced, and the performance is
effectively saturated using 10 resolutions. Figures 17 and 18 show similar results for
the Pearson correlation (P) test.

Figures 19 and 20 show the performance degradation of the P test and F test as
the starting resolution is degraded. For both the F and P tests, starting resolutions
up to 1.5 ft. performed as well as or better than the single resolution F test baseline
detector. At coarser starting resolutions, the multiresolution tests outperform the
single resolution test only at moderate/high false alarm rates.

Figures 21 through 24 show the performance of the tests when the starting resolutions
were varied. The number of resolutions was held constant at 15. In each
case, the performance of the tests degrades as the starting resolution is coarsened. At
a starting resolution of 1.5 ft., the multiresolution F test, $\psi_2$, and the Pearson
correlation test, $\psi_3$, performed the best and were essentially equivalent. They exhibited
a significant performance gain over a single resolution F test. At 2 ft. starting
resolution, multiresolution gain is only exhibited at moderate/high false alarm rates. At
a starting resolution of 3 ft., there was practically no performance gain provided by
the multiresolution tests. Lastly, at a starting resolution of 5 ft., the multiresolution
tests performed worse than the single resolution test. The degraded performance at
the coarse starting resolutions is to be expected since the law of large numbers will
become applicable to the target signatures in this resolution regime; that is, a coarse
resolution cell contains enough target scatterers that its signature becomes statistically
similar to clutter.

Figure 12: Detection statistics for the various spatial multiresolution strategies.

Figure 13: ROC performance of the various spatial multiresolution strategies.

Figure 14: Histograms of the decision statistics for the various strategies.

Figure 16: ROC performance of the multiresolution F test as the number of resolutions
is increased. Starting resolution is 1.5 ft.

Figure 17: ROC performance of the multiresolution P test as the number of resolutions
is increased. Starting resolution is 1 ft.

Figure 18: ROC performance of the multiresolution P test as the number of resolutions
is increased. Starting resolution is 1.5 ft.

Figure 19: ROC performance of the multiresolution F test as the starting resolution
is coarsened.

Figure 20: ROC performance of the multiresolution P test as the starting resolution
is coarsened.

Figure 21: ROC performance of the multiresolution tests for a starting resolution of
1.5 ft.

Figure 24: ROC performance of the multiresolution tests for a starting resolution of
5 ft.

5.3 Empirical Results - STAR Data

In this subsection, we describe the results obtained from applying the various
multiresolution detection schemes to collected data. The STAR algorithm described in
Section 4 is brought to bear on this problem. The STAR data set was also enhanced
using ERIM DCS data from Grayling, MI. These data sets were ground truthed with
respect to the type and position of the targets within each scene. The scoring that we
show in this subsection is based on the detection of the extended targets and not
the pixel-by-pixel scoring used in the previous subsection. All non-target
detections were counted as false alarms regardless of their cause (e.g. placed
trihedrals, other cultural objects, etc.). No spatial filtering was performed in the detection
stage to eliminate small false alarms. The peak value of the decision statistic within
a detection blob was used as the cue rating factor to generate the empirical ROC
curves.
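
As a minimal sketch of the scoring just described, assume each detection blob has already been reduced to its peak decision statistic and labeled as target or false alarm from ground truth, and that each target yields at most one blob. The empirical ROC can then be traced by sweeping a threshold over the peak values. The function name, the input layout and the normalization of false alarms by searched area are illustrative assumptions, not details taken from the STAR software.

```python
def empirical_roc(blobs, n_targets, area_km2):
    """Sweep a threshold over blob peak scores.

    blobs: list of (peak_score, is_target) pairs, one per detection blob.
    n_targets: number of ground-truthed targets in the scene set.
    area_km2: searched area, used to express false alarms per square km.
    Returns a list of (false_alarms_per_km2, detection_rate) points.
    """
    points = []
    # Visit candidate thresholds from the highest peak score down.
    for thr in sorted({score for score, _ in blobs}, reverse=True):
        hits = sum(1 for s, t in blobs if t and s >= thr)
        fas = sum(1 for s, t in blobs if not t and s >= thr)
        points.append((fas / area_km2, hits / n_targets))
    return points


# Hypothetical data: four ground-truthed targets, only two of which
# produced a detection blob, plus two false-alarm blobs.
blobs = [(9.0, True), (7.5, False), (6.0, True), (3.0, False)]
roc = empirical_roc(blobs, n_targets=4, area_km2=2.0)
```

In this toy input the detection rate cannot exceed 0.5 however low the cue threshold is set, because two targets produced no blob at all. This is the same mechanism by which a fixed pixel-by-pixel threshold caps the achievable detection rate of a detector.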

The data sets used in these studies were provided by the Lincoln Laboratory
ADTS system and the ERIM DCS system. Both are fine resolution SAR
systems. The ADTS data is the standard data set that was used in the Strategic
Target Algorithm Research (STAR) program. This data set was used so we could
compare our results with previous detector developments. It contained
numerous military vehicles in various deployments and netting conditions. There were
approximately 780 target realizations in the ADTS data set. This set was augmented
by the DCS collections in Grayling, Aberdeen and Eglin, which made an additional 260
target realizations available. The clutter data used was from the ADTS
sensor and included both natural and cultural clutter. Approximately 750
square km of clutter was used in the studies.

The TCR gain provided by multiresolution processing on the data sets is shown
in Figure 25. This figure shows a rank ordering of the TCR gain between the single
resolution F test and the multiresolution F test for both the ADTS and DCS data
sets. We define the TCR gain through a general statistical distance metric called the
Hellinger distance. For the two hypotheses

\[ H_0: \text{Clutter} \quad \psi \sim p_0(\psi) \]

\[ H_1: \text{Target} \quad \psi \sim p_1(\psi) \]

the distance (TCR) between the target and clutter distributions is defined as

\[ \mathrm{TCR} = H(p_0(\psi), p_1(\psi)) = \frac{1}{\sqrt{2}} \left[ \int \left( \sqrt{p_1(\psi)} - \sqrt{p_0(\psi)} \right)^2 d\psi \right]^{1/2}. \tag{61} \]


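For two univariate Gaussian distributions with means $\mu_c$, $\mu_t$ and standard deviations
$\sigma_c$, $\sigma_t$, where $p_0(\psi) \sim N(\mu_c, \sigma_c)$ and $p_1(\psi) \sim N(\mu_t, \sigma_t)$,
the Hellinger distance simplifies to

\[ H(p_0(\psi), p_1(\psi)) = \left[ 1 - \sqrt{\frac{2\gamma}{1 + \gamma^2}} \exp\left\{ -\frac{(\mu_t - \mu_c)^2}{4\sigma_c^2(\gamma^2 + 1)} \right\} \right]^{1/2} \tag{62} \]

where $\gamma = \sigma_t/\sigma_c$. Other metrics such as Mahalanobis distance assume that the mean
value or variance are equal between the two hypotheses.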

We calculate the Hellinger distance (TCR) between pairs of target and clutter
areas of the decision statistics of the detection schemes. For the single resolution
case we call this $H_{sres}$; for the multiresolution case, $H_{mres}$. We compare the
Hellinger distance on the same clutter/target areas to compute gain. The clutter
areas chosen were natural clutter areas. Each target realization had a distinct clutter
area associated with it. The Hellinger distance is an “amplitude” measure of distance.
Therefore, the gain in dB is computed as

\[ \text{Gain (dB)} = 20 \log_{10} H_{mres} - 20 \log_{10} H_{sres}. \tag{63} \]

We found an average of 4.2 dB of TCR gain due to multiresolution processing for the
ADTS data and an average gain of 3.7 dB for the DCS data.
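
The gain computation can be sketched as follows. This is an illustration under stated assumptions rather than the processing used in the study: the decision statistics over each area are modeled as univariate Gaussians, the closed form used is the standard textbook expression for the Hellinger distance between two Gaussians (with the $1/\sqrt{2}$ normalization of (61)), and the example means and standard deviations are hypothetical. A direct numerical evaluation of the defining integral is included as a cross-check.

```python
import math


def hellinger_gauss(mu_c, sigma_c, mu_t, sigma_t):
    """Hellinger distance between N(mu_c, sigma_c^2) and N(mu_t, sigma_t^2)."""
    var_sum = sigma_c ** 2 + sigma_t ** 2
    # Bhattacharyya coefficient: the integral of sqrt(p0 * p1).
    bc = math.sqrt(2.0 * sigma_c * sigma_t / var_sum) * math.exp(
        -((mu_t - mu_c) ** 2) / (4.0 * var_sum))
    return math.sqrt(max(0.0, 1.0 - bc))


def hellinger_numeric(mu_c, sigma_c, mu_t, sigma_t, n=20001):
    """Evaluate the defining integral directly with a rectangle rule."""
    def pdf(x, m, s):
        return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

    lo = min(mu_c - 10.0 * sigma_c, mu_t - 10.0 * sigma_t)
    hi = max(mu_c + 10.0 * sigma_c, mu_t + 10.0 * sigma_t)
    dx = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * dx
        diff = math.sqrt(pdf(x, mu_t, sigma_t)) - math.sqrt(pdf(x, mu_c, sigma_c))
        total += diff * diff
    return math.sqrt(0.5 * total * dx)


def tcr_gain_db(h_mres, h_sres):
    """Gain in dB, treating the Hellinger distance as an amplitude quantity."""
    return 20.0 * math.log10(h_mres) - 20.0 * math.log10(h_sres)


# Hypothetical target/clutter statistics for one target-clutter pair.
h_sres = hellinger_gauss(0.0, 1.0, 1.0, 1.5)   # single resolution
h_mres = hellinger_gauss(0.0, 1.0, 2.5, 1.5)   # multiresolution
gain = tcr_gain_db(h_mres, h_sres)
```

Averaging `gain` over all target/clutter pairs would give a figure comparable in kind to the average gains quoted above, assuming the Gaussian model is adequate for the decision statistics.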

Figures 26 and 27 show the ROC performance of the various tests on the STAR
data set. Figure 26 compares the performance of various multiresolution detectors
with that of the single resolution STAR baseline algorithm $\psi_1$. $\psi_4$ performed
the best of all the tests, having both the smallest false alarm rate and the largest
detection rate. The remaining multiresolution tests ranked next. The baseline single
resolution test, $\psi_1$, performed the worst both in terms of detection rate and false alarm
rate. In all cases the starting resolution was 1 ft., and 15 resolutions between 1 and 10 ft.
were used in the multiresolution tests. The detection performance of the single resolution
detector saturated at $P_d \approx 0.55$. This is due to the setting of the pixel-by-pixel threshold:
no pixels were detected on some dim targets. This threshold remained constant when
the multiresolution process was used. The multiresolution detectors saturated at
$P_d \approx 0.78$. The enhanced detection performance demonstrates the TCR gain afforded
by the multiresolution processing. The same conclusions can be drawn from the
DCS data set, as shown in Figure 27.

Figure 28 shows the performance of $\psi_1$ and $\psi_4$ as a function of starting resolution.
As resolution is coarsened from 1 to 3 ft., the single resolution detector $\psi_1$ has degraded
detection performance. The pixel-by-pixel threshold was held constant, while
the peak target energy decreases as resolution is coarsened. For all three resolutions
considered in this study, the multiresolution detector significantly outperforms the
single resolution detector. The multiresolution detector provides better detectability
at 3 ft. than the single resolution detector at 1 ft.

5.4  Results  -  Discrimination 

This subsection describes the results of the discrimination stage of the algorithm.
The results obtained in this section are based on the STAR collected data set. Our
discrimination algorithm is a false alarm rejection algorithm based on spatial features
derived from the binary detection maps that the STAR detection algorithm provides.

5.4.1  Feature  Extraction 

Here we discuss the rationale and computation of the features extracted for each
aggregate blob. Bear in mind that these features are selected based on their potential
to cluster differently on targets-of-interest than on other objects. Table 2 summarizes
the features extracted.

Type      Feature
spatial   mass
          diameter
          square-normalized rotational inertia

Table 2: List of features extracted for each aggregate blob.

Figure 25: Target to clutter ratio gain afforded by multiresolution processing.

Figure 26: ROC performance of the multiresolution tests applied to the STAR data
set.

Figure 27: ROC performance of the multiresolution tests applied to the DCS data
set.

Figure 29: Detection/discrimination algorithm flow of multiresolution/STAR
algorithm. (Diagram labels: Detection Score, Discrimination Score.)

The three spatial features provide distinct measures of coarse spatial properties
of an aggregate blob. Mass is the number of pixels in the blob. Diameter measures,
roughly, the maximum linear dimension of the blob, in pixels. More precisely, the
diameter feature is the integer part of the length of the diagonal of a horizontally or
vertically oriented rectangle which just encloses the blob. For
example, the diameter feature for a 5 x 7 pixel rectangle, oriented horizontally, is 8
(the diagonal length is $\sqrt{5^2 + 7^2} \approx 8.6$).
Square-normalized rotational inertia (SNRI) is the second moment of the blob pixel
coordinates about the blob center of mass (its inertia), divided by the inertia of an
equal-mass square blob. The expression for SNRI is:

\[ \mathrm{SNRI} = \frac{6}{M(M-1)} \sum_{m=0}^{M-1} \left[ (x_m - \bar{x})^2 + (y_m - \bar{y})^2 \right] \]

where the $(x_m, y_m)$ are the coordinates of the $m$th pixel on the blob, $(\bar{x}, \bar{y})$ are
the coordinates of the blob center of mass, and $M$ is the blob mass. The SNRI for a
horizontally-oriented $L \times W$ pixel rectangle is $\frac{1}{2} \times \frac{L^2 + W^2 - 2}{LW - 1}$. As the length/width ratio
(aspect-ratio) of the rectangle increases, the SNRI approaches half the aspect-ratio.
In general, SNRI increases as higher proportions of blob pixels become distant from
the center of mass. Filled circular blobs exhibit low SNRI (less than one), while long,
thin, or annular blobs exhibit high SNRI. Squares exhibit unit SNRI.
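
The three features can be computed directly from the pixel coordinates of a blob. The following is a minimal sketch, assuming a blob is given as a list of integer (x, y) pixel coordinates; the function name and the convention of returning an SNRI of 0 for a single-pixel blob are our own illustrative choices.

```python
import math


def blob_features(pixels):
    """Return (mass, diameter, snri) for a blob.

    pixels: list of (x, y) integer coordinates belonging to one blob.
    """
    mass = len(pixels)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    # Side lengths, in pixels, of the axis-aligned enclosing rectangle.
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    # Integer part of the diagonal of the enclosing rectangle.
    diameter = int(math.hypot(width, height))
    # Second moment of the pixel coordinates about the center of mass.
    xbar = sum(xs) / mass
    ybar = sum(ys) / mass
    inertia = sum((x - xbar) ** 2 + (y - ybar) ** 2 for x, y in pixels)
    # An equal-mass square blob has inertia M(M-1)/6, so divide by that.
    # Assumption: a single-pixel blob is assigned SNRI 0.
    snri = 6.0 * inertia / (mass * (mass - 1)) if mass > 1 else 0.0
    return mass, diameter, snri


rect = [(x, y) for x in range(7) for y in range(5)]      # 5 x 7 rectangle
square = [(x, y) for x in range(4) for y in range(4)]    # 4 x 4 square
rect_mass, rect_diam, rect_snri = blob_features(rect)
square_snri = blob_features(square)[2]
```

For the 5 x 7 rectangle this reproduces the diameter of 8 quoted in the text, and the SNRI agrees with the rectangle expression (49 + 25 - 2) / (2 (35 - 1)); the square gives unit SNRI.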

5.4.2  Discrimination  Algorithm  /Results 

The discrimination algorithm that we employed in these studies was a feature based
tree structured classifier. We only examined binary spatial target features since we
conjectured and found that the multiresolution process provides a higher per pixel
detection rate: the extended target had more pixel hits on it using
multiresolution data than in the single resolution case. We therefore conjectured that
the binary spatial features would be a powerful and simple discriminant feature set.
We took the tree structured approach due to its nonparametric nature. There are no
compelling reasons known to the authors that would lead to any specific parametric
model for the three spatial features cited, especially a multivariate Gaussian model.

Tree structured classification approaches are nonparametric in nature. By this we
mean that no prior statistical model is assumed for the classes of interest. Rather,
the processor is trained from collected data. Tree structured approaches have one
great advantage over other nonparametric algorithms, however. For large training
sets, tree structured approaches have been shown to converge to the optimum
performance (minimum probability of error) produced by the Bayes classifier [29]. This
means that the tree structured classifier would attain the performance of a parametric
processor where the underlying models were completely specified and correct. Many
nonparametric algorithms do not exhibit this characteristic since they must assume
a priori the order of the processor and hope that the data will conform to this order.

A  tree  structured  algorithm  is  a  sequence  of  binary  decisions  on  data  to  extract 
information  comprehensively  and  rapidly.  A  decision  tree  is  shown  in  Figure  30.  The 
tree  is  constructed  by  repeated  splits  of  the  feature  space  X  into  subsets.  By  feature 
space,  we  mean  the  space  of  all  possible  measurement  vectors  (a  measurement  vector 
consists  of  an  ordered  group  of  observables  e.g.  the  vector  of  binary  spatial  features 
of  an  extended  target  detection).  As  shown  in  the  figure,  each  split  is  binary  (the 
feature  space  is  split  into  two  subsets).  The  decision  point  where  a  split  occurs  is 
called  a  node. 

Trees are normally grown via a steepest descent type algorithm. These algorithms
are formally equivalent to the K-means algorithm [30]. They are iterative
and try to optimally split the feature space into two distinct areas at each node. A
minimum misclassification error criterion is used as the measure of performance at
each node. An initial set of partitions is selected and the misclassification rate is
estimated. The partitions are then perturbed until the minimum misclassification rate is
found at that node. This procedure is stepwise optimal [31] (no other procedure can
do better with respect to minimum error rate at this node).
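
To make the node criterion concrete, the sketch below finds the split of a single feature that minimizes the number of misclassified training samples at one node. It is a simplification under stated assumptions: it uses an exhaustive search over candidate thresholds on one feature rather than the iterative perturbation of partitions described above, each side of the split is labeled by majority vote, and the function name and data are hypothetical.

```python
def best_split(values, labels):
    """Pick the threshold on one feature that minimizes misclassification.

    values: feature values (for example blob mass) of the training samples.
    labels: 0 (false alarm) or 1 (target) for each sample.
    Returns (threshold, errors); samples with value <= threshold go left.
    A threshold of None means that no split improves on the unsplit node.
    """
    def errors(side):
        # Under a majority-vote label, the minority count is the error count.
        ones = sum(side)
        return min(ones, len(side) - ones)

    best = (None, errors(labels))
    # The largest value is skipped because it would leave the right side empty.
    for thr in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= thr]
        right = [l for v, l in zip(values, labels) if v > thr]
        err = errors(left) + errors(right)
        if err < best[1]:
            best = (thr, err)
    return best


# Hypothetical training data: blob mass and ground-truth label.
mass = [3, 5, 8, 40, 55, 60, 7, 45]
label = [0, 0, 0, 1, 1, 1, 0, 1]
split = best_split(mass, label)
```

With several features, the same search would be repeated per feature and the best feature/threshold pair kept for the node.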

The decision of when to declare a node terminal is based on a “purity measure.”
Each node splits the feature space, making the resultant data “more pure.” Numerous
purity measures exist. One of the most often used [29] is the entropy of the data
at that point. If a split does not change the purity of the data by a predetermined amount,
that node is defined as a terminal node.
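
A minimal sketch of an entropy-based stopping rule follows. The use of the size-weighted average entropy of the two children, the unit of bits, and the value of the hypothetical `min_gain` threshold are our assumptions; the text only states that a node becomes terminal when the purity does not change by a predetermined amount.

```python
import math


def entropy(labels):
    """Shannon entropy, in bits, of the class labels reaching a node."""
    n = len(labels)
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h


def is_terminal(parent, left, right, min_gain=0.05):
    """Declare the node terminal if the split improves purity too little.

    parent, left, right: class labels at the node and at its two children.
    min_gain: hypothetical minimum entropy reduction, in bits.
    """
    n = len(parent)
    child = (len(left) * entropy(left) + len(right) * entropy(right)) / n
    return entropy(parent) - child < min_gain


pure_split = is_terminal([0, 0, 1, 1], [0, 0], [1, 1])      # large purity gain
useless_split = is_terminal([0, 0, 1, 1], [0, 1], [0, 1])   # no purity gain
```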

Class labels are easily affixed to terminal nodes. The training and tree growing
procedures are supervised. Therefore each input data vector has a label for each
of the classes of interest. A simple voting routine is used: the class with the largest
number of training data residing in that partition defines the label of the terminal
node.

Figure 30 depicts the results obtained by running the tree structured discrimination
algorithm. As shown, for the single resolution case, only 173 of a possible 297
targets were given to the tree structured classifier. This is due to the lower detection
rate of the single fine resolution detector $\psi_1$. In addition, 365 false alarms were
presented to the classifier from the single resolution data. Conversely, the AR test had
a much higher detection rate, so that 261 of the 297 targets were presented to the tree
structured classifier. Its lower false alarm rate provided 287 false alarms. A classification
tree was grown and optimized for each of the single resolution and multiresolution cases.
The results show that for the single resolution case, 20 false alarms were still classified
as target while 4 targets were rejected as false alarms. For the multiresolution case,
14 false alarms were classified as target and 0 targets were rejected as false alarms.
These results lend credence to the claim that the multiresolution detector provides more
target “fill” on extended targets (higher per pixel detection rate), which provides better
discrimination for the spatial features used here. This extra “fill” can be exploited
more fully in a discrimination algorithm than that found in the single resolution case.

Detector       Targets passed   False alarms passed   False alarms kept   Targets rejected
               to classifier    to classifier         as target           as false alarm
Single Res.    173/297          365                   20                  4
AR test        261/297          287                   14                  0

(Tree node label shown in the figure: Mass.)

Figure 30: Results of the discrimination algorithm on both the single resolution and
multiresolution data.

6 Conclusions


In this section we briefly summarize the main points of the current research. We have
attempted to explore multiresolution processing as a means of providing high
performance target screening in SAR data. We exploit the interference between
prominent scatterers in a resolution cell as our discriminant. The hope is that the
interference will provide a characteristic signature change as resolution is changed. We
formulated statistical models for both clutter and target multiresolution signatures.
We showed that the multiresolution clutter process was a Brownian motion process.
Using the property of independent increments for Brownian motion processes, we
derived a simple resolution sampling strategy that whitens the clutter process. Based
on these models, we developed a number of multiresolution tests. These tests
consisted of a Generalized Likelihood Ratio approach, composite hypothesis tests and a
generalized matched filter. We discounted the matched filter test since it presupposed
a specific target signature, which we did not feel conformed to the spirit of first stage
screening.

Examining synthetic data, we found that the multiresolution detector far
outperforms single resolution detectors on a per pixel basis. A Generalized Likelihood
Ratio approach using a local target area provided the best result. We noted that the
performance of the multiresolution detectors saturated at approximately 15
resolutions. We also noted that when the starting resolution was coarsened, the performance
of the multiresolution detector suffered and saturated at fewer
resolutions. However, at 2 ft. resolution, the multiresolution strategies performed as well as
or better than single resolution strategies at 1 ft. In our studies we kept the range
resolution fixed at 1 ft. This allowed us to explore the collection aperture implications
of the multiresolution study.

We used the multiresolution detection strategies as a substitute for the first stage
of the Strategic Target Algorithm Research (STAR) algorithm, which we have used as
a means of detection/discrimination of extended targets in clutter. We applied the
multiresolution approaches to Lincoln Laboratory ADTS data and ERIM DCS data
encompassing both targets and clutter. We found that the multiresolution approach
provided significantly better detectability of targets (a detection rate of .8 vs. .55) and
better false alarm performance (4 vs. 6 false alarms per square km). These results were
consistent with our TCR findings. We found that using the multiresolution detection
schemes, an average 4 dB target to clutter ratio gain was obtained when the DCS and
ADTS target/clutter data sets were examined. The multiresolution GLRT provided the
best results in our study. We also found that the multiresolution detectors' performance
was surprisingly robust to starting resolution out to 3 ft. Significant detectability gains
were encountered there (.75 vs. .48).

We found that the multiresolution detectors also had a much higher pixel-by-pixel
detection performance than the single resolution scheme for collected extended
targets. This manifests itself in spatial features that can be used for target
discrimination. We constructed a tree structured classifier as the basis of our discrimination
algorithm. We applied it to the target detections and false alarms provided by the single
resolution and multiresolution detection schemes. We found that the discrimination
algorithm provided much better performance based on the multiresolution data than
on the single resolution data. Using the multiresolution schemes, no targets were
rejected as false alarms and 273 of 287 false alarms were rejected. For the single
resolution case, only 169 of 297 targets were detected with 20 false alarms. This
implies that multiresolution processing may provide spatial features that provide high
performance discrimination capability.

The appendices give a detailed derivation of a multiresolution sampling strategy
for a generalized matched filter detector. Solutions for specific SAR impulse responses
are provided. This analysis shows that a significant performance gain can be obtained
over single resolution and dyadic multiresolution strategies (wavelets). This analysis
would be used as a precursor to a multiresolution target classification algorithm.

Appendix A Resolution Sampling Strategy


In the multiresolution problem, the objective is to use images of various resolutions
in order to enhance target detectability. A natural first question is:

- given a known target or a known class of targets, what are the optimal resolutions
to use in order to maximize the performance of the detector?


The  purpose  of  this  appendix  is  to  provide  an  analytical  solution  to  this  problem 
under  the  assumption  of  a  completely  known  target  signature.  In  fact  we  actually 
solve  a  more  general  problem  in  that  we  derive  a  strategy  which  is  optimum  when 
taking  into  account  target  detectability  and  the  costs  of  processing  the  images,  i.e., 
every  new  resolution  requires  some  new  processing  to  form  an  image  at  that  resolution 
and  our  general  solution  takes  this  into  account  also. 

As stated earlier, there has been a great deal of work on understanding the wavelet
transform as a deterministic operator on square-integrable functions. To our knowledge,
very little work has been done on wavelet transforms in a stochastic environment,
which is the setting we have here. Because of our framework, it would be very interesting to
research properties of wavelets in this more general stochastic setting.

We make some further assumptions on the clutter process $W$ and thermal noise
process $N_0(\cdot\,; \rho)$, in that:

(a1) $W$ is a white circular complex Gaussian spatial process in $R^2$ with intensity
$\sigma^2$, i.e., for disjoint bounded sets $A_1, \ldots, A_k \subset R^2$,

\[ \left\{ \int_{A_j} W(y)\,dy : 1 \le j \le k \right\} \tag{A-1} \]

are independent circular complex Gaussian random variables with

\[ E\left[ \left| \int_{A_j} W(y)\,dy \right|^2 \right] = \sigma^2\, \mathrm{area}(A_j), \quad 1 \le j \le k. \tag{A-2} \]

(a2) for fixed $\rho$, $N_0(\cdot\,; \rho)$ is a spatially white circular complex Gaussian process.

(a3) for a fixed $x$ the process $N_0(x; \cdot)$ is a circular complex Gaussian process with

\[ E\left( N_0(x; \rho) N_0^*(x; \rho') \right) \propto \frac{1}{\max\{\rho, \rho'\}} \tag{A-3} \]

(a4) $W$ and $N_0$ are independent of each other.

For real stochastic processes on an interval $T_0$ of a positive time axis, there is a
notion of a Brownian motion process with parameter $\sigma^2$. A real stochastic process
$\{X(t) : t \in T_0\}$ is said to be a Brownian motion with scale parameter $\sigma^2$ if it is a
mean 0 Gaussian process with stationary independent increments and $E(X^2(t)) = \sigma^2 t$
for every $t \in T_0$. Note that in the case of $X$ being a Brownian motion, we have

\[ E(X(s)X(t)) = \sigma^2 \min\{s, t\} \tag{A-4} \]

and in fact this characterizes a real Brownian motion process when it is assumed that
the process has mean 0. We now want to generalize the notion of a Brownian motion
process to the case where the process is a complex mean 0 Gaussian process. In this
case we say a complex stochastic process $\{Y(t) : t \in T_0\}$ with index set $T_0 \subset [0, \infty)$
is a complex Brownian motion process with parameter $\sigma^2$ if it is a circular complex
Gaussian process with the real and imaginary parts of $Y$ being independent Brownian
motions with common parameter $\sigma^2/2$.
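
The link between this definition and assumption (a3) can be checked numerically. A covariance proportional to $1/\max\{\rho, \rho'\}$ is exactly the Brownian motion covariance $\min\{s, t\}$ evaluated at the inverted indices $s = 1/\rho$, $t = 1/\rho'$. The sketch below, with hypothetical names and a unit noise parameter chosen for illustration, builds a complex Brownian motion in the index $1/\rho$ from independent increments and estimates the covariance between two resolutions by Monte Carlo.

```python
import math
import random


def sample_noise(rhos, sigma2, rng):
    """Draw one noise realization at each resolution in rhos.

    The realization is a complex Brownian motion in the inverted index
    s = 1 / rho, built from independent circular Gaussian increments.
    """
    order = sorted(range(len(rhos)), key=lambda i: 1.0 / rhos[i])
    out = [0j] * len(rhos)
    s_prev, value = 0.0, 0j
    for i in order:
        s = 1.0 / rhos[i]
        # The increment has variance sigma2 * (s - s_prev), split evenly
        # between the independent real and imaginary parts.
        std = math.sqrt(sigma2 * (s - s_prev) / 2.0)
        value += complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
        out[i] = value
        s_prev = s
    return out


rng = random.Random(0)          # fixed seed so the sketch is repeatable
rhos = [1.0, 2.0, 5.0]          # hypothetical resolutions in ft.
trials = 20000
acc = 0j
for _ in range(trials):
    n = sample_noise(rhos, sigma2=1.0, rng=rng)
    acc += n[0] * n[1].conjugate()          # rho = 1, rho' = 2
estimate = (acc / trials).real              # expected near 1 / max(1, 2)
```

The estimate should be close to 0.5, the value of $1/\max\{1, 2\}$ for a unit noise parameter, up to Monte Carlo error.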

The assumption in (a3) is essentially equivalent to the assumption that $N_0$ is a
complex Brownian motion when the index set is inverted in $\rho$. More specifically, if we
redefine the index set of $N_0(\rho)$ to be $\{1/\rho : \rho \in [\rho_l, \rho_u]\}$ and define the process $N_r$ on
this index set as

\[ N_r\left(\frac{1}{\rho}\right) = N_0(\rho) \tag{A-5} \]

then $N_r$ is a complex Brownian motion. This is the essence of the assumption (a3).
Physically, this assumption is related to the fact that the amount of thermal noise
is proportional to the length of the aperture, which is inversely proportional to
resolution.

Now, based on the above model, we want to derive the optimal sampling (in resolution)
strategy for discriminating between the hypotheses $H_0$ and $H_1$. If we think of
$I(x; \rho)$ as a stochastic wavelet transform plus noise, this amounts to choosing the
optimal scale parameters of the noisy wavelet transform process. Because of the
complexity of the problem, the approach we have taken is to fix the location $x$ (known
in wavelet jargon as the translation parameter), which we can assume is 0, and
derive the optimal sampling strategy of the noisy wavelet transform process $I(x; \rho)$
in terms of $\rho$. Of course this is not a fully realistic solution of the original problem
(and the astute reader will also have already noted other unrealistic simplifications,
which will be discussed in more detail in the Summary). But we hope, by solving
this problem, to shed some light on the analytical choice of the sampling
strategy and on setting up and solving the much more complex
general problem of considering all of the pixel locations simultaneously.

As stated earlier, for the sake of notational simplicity we will assume without loss
of generality that the fixed location $x = 0$, and thus we are trying to derive a sampling
strategy for discriminating between $H_0$ and $H_1$, which are no longer functions of $x$.
In doing this we will be using ideas from Cambanis/Masry (1983), hereafter referred
to as CM. The new hypotheses based on this fixed location assumption are given by

    H_0 (clutter only):  I(p) = \frac{1}{\sqrt{p}} \int h(y/p) W(y) dy + N_0(p)
                              = N(p)    (A-6)

and

    H_1 (target present):  I(p) = \frac{1}{\sqrt{p}} \int h(y/p) ( W(y) + g_t(y) ) dy + N_0(p)
                                = \frac{1}{\sqrt{p}} \int h(y/p) g_t(y) dy + [ \frac{1}{\sqrt{p}} \int h(y/p) W(y) dy + N_0(p) ]
                                = S(p) + N(p)    (A-7)

where we have actually substituted h(-·) in place of h in the above equations. This was done since it simplifies notation and makes absolutely no difference to the final results. Also recall that the index set for the processes in both (A-6) and (A-7) is the set of resolutions p in the range [p_l, p_u]. By the above representations, we see that N is a circular complex Gaussian process with the autocorrelation function


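    R(p, p') = E( N(p) N^*(p') )
             = E[ ( \frac{1}{\sqrt{p}} \int h(y/p) W(y) dy + N_0(p) ) ( \frac{1}{\sqrt{p'}} \int h(y'/p') W(y') dy' + N_0(p') )^* ]    (A-8)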


This is also the autocovariance function of the process S(p) + N(p) under the assumption of H1. The above formulation is quite similar to the problem considered in CM (1983), the only differences being that they considered real rather than complex Gaussian processes and that they made a technical assumption on the signal process S relative to the autocorrelation function R. Specifically, they made the following assumption:

(a5) the signal process S satisfies: there exists a square integrable function f over [p_l, p_u] such that

    \int_{p_l}^{p_u} R(p, p') f(p') dp' = S(p).    (A-9)

The assumption (a5) (and previous assumptions) imply that even if one knew I over the whole interval [p_l, p_u], it is not possible to perfectly discriminate between H0 and H1, i.e., between clutter only and clutter plus target. It also implies that the optimal detector (for a target being present) using values of I across all resolutions in the range [p_l, p_u] is given by

    Accept H_1 and declare a target if  \Re( \int_{p_l}^{p_u} I(p) f^*(p) dp ) > T    (A-10)

where T is a threshold and \Re denotes the real part. The above is stated in Cambanis/Masry (1983) for the real case and is a straightforward corollary of the Karhunen-Loeve expansion (discussed in Appendix A.4) and some analysis using the expansion to determine the likelihood ratio. This is done in detail in Grenander (1980) (cf. chapter 5). One can think of the test in (A-10) as a matched filter detector.

Of course, in our case we do not have a continuum of data; rather, we must choose a finite number of resolutions. This motivates the determination of a sampling strategy for the resolutions which gives detection performance as close as possible to the performance of the detector in (A-10). Before specifying such a sampling strategy, we mention an interesting property of the stochastic process {I(p) : p_l \le p \le p_u}, which holds under some realistic assumptions on the impulse response function h (for example, it holds if h or its Fourier transform is a sinc function). The property is that, under certain conditions on the mother wavelet function h, we have:

I is a Markov process in both the positive and negative directions of p, i.e., the conditional distribution of I(p_n) given I(p_1), ..., I(p_{n-1}) depends only on I(p_{n-1}) in the cases of

    p_1 < p_2 < ... < p_n    (A-11)

or

    p_n < p_{n-1} < ... < p_1.    (A-12)

The latter ordering is probably the more natural ordering to consider, since the Markov relationship here says that if one has a sequence of images whose resolution is becoming finer and finer, then the conditional distribution of an even finer resolution image depends only on the finest resolution previously considered. The technical assumptions on h are outlined rigorously in Appendix A.4, which also gives a careful derivation and discussion of the Markov property (P). Note that this suggests an alternative framework from which to investigate, namely that of optimal sampling of Markov processes for the purposes of detection. This would be an interesting approach which is significantly different from the approach presented in this appendix.
A.1 First Sampling Strategy

In  this  subsection,  we  set  up  a  framework  for  specifying  what  we  mean  by  optimal 
resolutions  from  a  target  detection  viewpoint.  We  then  use  this  framework  and  some 
analysis  to  develop  a  numerical  algorithm  for  determining  the  optimal  resolutions. 

To set up the framework, consider a fixed number of resolutions, say n, and suppose the objective is to choose the resolutions p_1, p_2, ..., p_n so that, based on I(p_1), ..., I(p_n), we have optimal detection performance. Before proceeding, we have to clearly specify what is meant by the term optimal. The framework we will use is as follows. We specify the probability of false alarm to be some fixed value, which we denote by P_fa. For any finite set of resolutions p_1, ..., p_n, there is some achievable probability of detection given that the probability of false alarm is as specified. We denote this probability of detection by P_D(\underline{p}_n), where \underline{p}_n = (p_1, ..., p_n)^t.

Mathematically, we want to find the resolutions, i.e., the vector \underline{p}_n^o, which satisfy

    P_D(\underline{p}_n^o) = \sup_{\underline{p}_n} P_D(\underline{p}_n)    (A-13)

We denote the righthand side (RHS) of (A-13) by P_D^n. As discussed in CM (1983), based on the assumption in (a5), there is an upper bound on the probability of detection P_D^n which is strictly below 1. It is essentially the probability of detection for the optimal test given all the resolutions. As discussed earlier, this test would be based on thresholding the real part of the statistic \int I(p) f^*(p) dp, where f is the function given in (a5). We denote the maximal probability of detection for this test by \bar{P}_D. As shown in CM (1983), as n increases, P_D^n -> \bar{P}_D. It turns out that the determination of the \underline{p}_n which satisfies (A-13) is exactly equivalent to another criterion, namely that of finding the resolutions which maximize what is known as the generalized signal-to-noise ratio (GSNR). This is discussed in greater detail in Appendix C.2, which essentially consists of results from CM (1983) adapted to our situation. For a set of resolutions \underline{p}_n = (p_1, ..., p_n)^t, we will denote the GSNR by G(\underline{p}_n), and it is defined by

    G^2(\underline{p}_n) = \frac{ [ E_1 \ell(I(\underline{p}_n)) - E_0 \ell(I(\underline{p}_n)) ]^2 }{ Var_0 \ell(I(\underline{p}_n)) }    (A-14)


where E_j is the expectation operator under H_j, j = 0, 1, Var_0 is the variance operator under H_0, and I(\underline{p}_n) is the vector [I(p_1), ..., I(p_n)]^t. The term \ell(I(\underline{p}_n)) corresponds to the log of the likelihood ratio (after dropping terms which do not depend on I(\underline{p}_n)) and is given by

(A  -  15) 

where \Sigma(\underline{p}_n) is the covariance matrix of the random vector [N(p_1), ..., N(p_n)]^t (which is the same under both H0 and H1), i.e.,

    ( \Sigma(\underline{p}_n) )_{ij} = E( N(p_i) N^*(p_j) ),    1 \le i, j \le n    (A-16)

The interpretation of the GSNR is heuristically simple: it is a measure of the separation of the distributions of the log-likelihood ratio \ell(I(\underline{p}_n)) under H1 and H0. Since maximizing the GSNR is essentially equivalent to maximizing the probability of detection, we take an approach analogous to that taken in CM (1983), where they chose \underline{p}_n to maximize the GSNR. It is useful to have an alternative interpretation of the resolutions which maximize the GSNR. By substitution of (A-15) into (A-14), it is easy to see that the GSNR has the form

    G^2(\underline{p}_n) = ( S^*(\underline{p}_n) )^t \Sigma^{-1}(\underline{p}_n) S(\underline{p}_n)    (A-17)

where S(\underline{p}_n) denotes the vector [S(p_1), ..., S(p_n)]^t. The RHS is a quadratic form in which the matrix \Sigma^{-1}(\underline{p}_n) is positive definite. Hence one can think of the RHS of (A-17) as an inner product of S(\underline{p}_n) with itself, as long as one thinks of the right inner product. This inner product is defined in Appendix A.4. More importantly, we can think of S(\underline{p}_n) as a projection of the continuous function S in this Hilbert space. Thus we can write the GSNR in (A-17) according to the following expression:

    G^2(\underline{p}_n) = || P_{\underline{p}_n} S ||_R^2    (A-18)

where || · ||_R is the norm in this more general Hilbert space and P_{\underline{p}_n} is a projection operator. For notational convenience we now drop the subscript R on the inner products and norms, since all such refer to this special Hilbert space. This Hilbert space is known as the Reproducing Kernel Hilbert Space (RKHS), and its nicety is that evaluating the process at a time instant (here, at a resolution) corresponds to a projection in this Hilbert space. For the sake of intuition, it is entirely appropriate to think of the projections we will be describing as occurring in finite dimensional space. In Appendix A.4, we give a more thorough discussion and background on RKHS's and their applications.
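
As a rough illustration of the quadratic form in (A-17), the following sketch evaluates the GSNR for two nested sets of resolutions. The kernel A * sqrt(min(p, p') / max(p, p')) is the clutter covariance derived in Section A.3 for a sinc impulse response; the signal S(p), the chosen resolutions, and all function names are assumptions made only for this example.

```python
import numpy as np

def noise_cov(p, A=1.0):
    """Covariance matrix with entries A * sqrt(min(p_i, p_j) / max(p_i, p_j))."""
    p = np.asarray(p, dtype=float)
    lo = np.minimum.outer(p, p)
    hi = np.maximum.outer(p, p)
    return A * np.sqrt(lo / hi)

def signal(p):
    """Hypothetical complex signal S(p), used for illustration only."""
    p = np.asarray(p, dtype=float)
    return np.sinc(p / 3.0) * np.exp(0.5j * p)

def gsnr(p):
    """G^2 = S^H Sigma^{-1} S, the quadratic form in (A-17)."""
    S = signal(p)
    Sigma = noise_cov(p)
    # Solve Sigma x = S instead of forming the inverse explicitly.
    x = np.linalg.solve(Sigma, S)
    return float(np.real(np.vdot(S, x)))

coarse = [2.0, 5.0, 9.0]
nested = [2.0, 3.5, 5.0, 9.0]   # the same set plus one extra resolution
g_coarse = gsnr(coarse)
g_nested = gsnr(nested)
print(g_coarse, g_nested)
```

Because the GSNR is the squared norm of a projection, adding a resolution to a set should never decrease it, which is what the comparison of the two printed values is meant to show.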

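Based on (A-18) and previously quoted results (outlined in Appendix C.2), one optimal algorithm (for resolution selection) proceeds by selecting \underline{p}_n^o = (p_{n1}^o, ..., p_{nn}^o)^t as the n resolutions which satisfy

    G^2(\underline{p}_n^o) = \sup_{\underline{p}_n} G^2(\underline{p}_n)
                           = \sup_{\underline{p}_n} || P_{\underline{p}_n} S ||^2    (A-19)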

The above maximization is over \underline{p}_n, i.e., over n-dimensional space, and in every iteration of this maximization one must compute a matrix inverse. Both of these factors suggest that for moderate to large values of n, the above maximization may be difficult to carry out numerically. This motivated CM (1983) to consider alternative schemes for selecting resolutions which, though not optimal for any finite number of resolutions n, are optimal in an asymptotic sense. A number of schemes were presented depending on the properties of the autocorrelation function R, and one of the cases they considered was that of the noise process being an independent increments, mean 0 Gaussian process. As proven in Appendix A.4, if we assume that h is a sinc function, our noise process corresponds to a circular complex Gaussian process with independent, but not stationary, increments. We conjecture that the sampling results presented in CM (1983) have very close analogues in this case. Thus we present the results in CM (1983), suitably modified for our case, as a heuristic algorithm which is likely to have asymptotic optimality properties similar to those stated in CM (1983), though further research is needed to verify that this extension is really valid. The procedure is to choose a probability density function \psi whose support is in the interval [p_l, p_u]. Then at stage n, we select the n quantile values {p_{n1}^\psi, ..., p_{nn}^\psi} of \psi,

    1 \le j \le n.    (A-20)

The result proven in CM (1983) (for the real case) is that

    n^2 ( \bar{P}_D - P_D(\underline{p}_n^\psi) ) -> K \int_{p_l}^{p_u} \frac{ \beta(p) |f(p)|^2 }{ \psi^2(p) } dp    (A-21)

where K is a known constant and \beta is a known function related to the autocorrelation function R. CM (1983) derived a sufficient condition for the selection procedure to be asymptotically optimal, namely that \psi be proportional to

    ( \beta |f|^2 )^{1/3}.

As stated earlier, we conjecture that a result very close to the above is in fact true, and so we view the above as a useful heuristic for deriving a simpler resolution selection algorithm. One technical snag is that it may be difficult to identify exactly the function f which satisfies the integral equation

    \int_{p_l}^{p_u} R(p, p') f(p') dp' = S(p).    (A-22)


But since we only want to know the approximate shape of f in order to derive a \psi which is approximately optimal (i.e., select \psi proportional to (\beta |f|^2)^{1/3}), it is probably useful to find a numerical approximation for f given our knowledge of R and S. This we can easily do by discretizing the problem, i.e., choosing p_1, ..., p_n where

    p_j = p_l + \frac{ j (p_u - p_l) }{ n },    1 \le j \le n    (A-23)

and then noting that (A-22) after discretization can be reformulated as

    \frac{ p_u - p_l }{ n }
    \begin{bmatrix} R(p_1, p_1) & \cdots & R(p_1, p_n) \\ \vdots & & \vdots \\ R(p_n, p_1) & \cdots & R(p_n, p_n) \end{bmatrix}
    \begin{bmatrix} f(p_1) \\ \vdots \\ f(p_n) \end{bmatrix}
    =
    \begin{bmatrix} S(p_1) \\ \vdots \\ S(p_n) \end{bmatrix}    (A-24)

Now one just solves this equation numerically for f(p_1), ..., f(p_n) and then uses these values and some numerical fitting routine to determine an approximation to f.
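
The discretization in (A-23) and (A-24), followed by the quantile selection, might be sketched as below. This is only a minimal sketch under stated assumptions: the kernel R, the signal S, the function beta, the use of |beta| to keep the density nonnegative, and the j/m quantile levels are all illustrative choices, not quantities taken from this report.

```python
import numpy as np

def solve_matched_filter(R, S, p_l, p_u, n):
    """Solve the discretized integral equation (A-24) for f on the grid (A-23)."""
    j = np.arange(1, n + 1)
    p = p_l + j * (p_u - p_l) / n          # grid of (A-23)
    dp = (p_u - p_l) / n                   # Riemann-sum weight
    Rmat = R(p[:, None], p[None, :])       # n x n matrix of R(p_i, p_j)
    f = np.linalg.solve(dp * Rmat, S(p))   # linear system (A-24)
    return p, f

def quantile_resolutions(p, density, m):
    """Pick m grid points at the j/m quantiles (assumed convention) of a tabulated density."""
    cdf = np.cumsum(density)
    cdf = cdf / cdf[-1]
    levels = np.arange(1, m + 1) / m
    idx = np.searchsorted(cdf, levels)
    return p[np.minimum(idx, len(p) - 1)]

# Illustrative inputs (assumptions made for this example only).
R = lambda a, b: np.sqrt(np.minimum(a, b) / np.maximum(a, b))
S = lambda p: np.exp(-0.5 * (p - 4.0) ** 2) + 0j
beta = lambda p: 1.0 / p

p, f = solve_matched_filter(R, S, p_l=1.0, p_u=10.0, n=90)
psi = (np.abs(beta(p)) * np.abs(f) ** 2) ** (1.0 / 3.0)   # psi proportional to (|beta| |f|^2)^(1/3)
samples = quantile_resolutions(p, psi, m=6)
print(samples)
```

In practice one would probably smooth or fit the computed values of f before forming \psi, as the text suggests, since the raw solution of a discretized first-kind integral equation can be sensitive to the grid.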


A.2 General Sampling Strategy

In this section we generalize the sampling strategy to take into account the processing costs. We assume that the costs of processing are linearly inversely related to resolution. However, it should be mentioned that our discussion could be carried over to a more general class of loss functions (than linear in the inverse of resolution). To set up the framework we need some notation. For the n-dimensional vector of resolutions \underline{p}_n, we denote the linear span of the autocorrelation functions {R(p_j, ·); 1 \le j \le n} by L(\underline{p}_n). We now assume for a fixed n that we want to find resolutions \underline{p}_n which minimize

    L_n(\underline{p}_n) = K' ( ||S||^2 - || P_{L(\underline{p}_n)} S ||^2 ) + K \sum_{j=1}^{n} \frac{1}{p_j}
                         = K' ( ||S||^2 - ( S^*(\underline{p}_n) )^t \Sigma^{-1}(\underline{p}_n) S(\underline{p}_n) ) + K \sum_{j=1}^{n} \frac{1}{p_j}    (A-25)

or equivalently maximize

    K' || P_{L(\underline{p}_n)} S ||^2 - K \sum_{j=1}^{n} \frac{1}{p_j}    (A-26)

Here K and K' are positive constants which reflect the proportional costs of the two terms. Now the first term, ||S||^2 - || P_{L(\underline{p}_n)} S ||^2, converges to 0 at a rate which is proportional to the rate at which \bar{P}_D - P_D^n converges to 0, as was stated in the previous subsection. Because it is easier to handle numerically (than \bar{P}_D - P_D^n), we base our loss function on ||S||^2 - || P_{L(\underline{p}_n)} S ||^2. Since we can divide the loss function by the constant K' and not change its basic structure (the optimal resolutions remain the same), we can without loss of generality assume that K' = 1, and so we only need to specify the constant K. Now note that as n gets large the loss function goes to \infty, so there is actually some optimal number of resolutions n^* and a corresponding set of n^* optimal resolutions p_1^*, ..., p_{n^*}^* which satisfies

    L_{n^*}(p_1^*, ..., p_{n^*}^*) = \inf_n \inf_{\underline{p}_n} L_n(\underline{p}_n)    (A-27)

This set of resolutions, which we denote as a vector \underline{p}^*, then represents the optimal set of resolutions across all possible finite sets of resolutions after taking into account the probability of detection and the costs of processing each image. Now K is related to how one views the cost of having a lower probability of detection versus paying for more processing of images, but as stated earlier the first term in the loss function, ||S||^2 - || P_{L(\underline{p}_n)} S ||^2, is asymptotically proportional to the difference \bar{P}_D - P_D(\underline{p}_n), which is the maximal probability of detection minus the probability of detection based on using the resolutions p_1, ..., p_n. This is described in more detail in Appendix C.2, where the precise proportionality constant is given; this may be useful for choosing the constant K. For this loss function and for n \in N, let \underline{p}_n^o = [p_{n1}^o, ..., p_{nn}^o]^t be a vector of resolutions which minimizes the loss function L_n(\underline{p}_n) for that n. Again it should be noted that the numerical determination of \underline{p}_n^o can be difficult for moderate to large values of n, for the same reasons as mentioned in the previous subsection. Thus, for moderate to large values of n, it would be useful to again derive simpler alternative selection algorithms, as was done in the previous subsection for the special case when the wavelet transform process had independent increments. This would be a good area for future research.
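
To make the trade-off in the loss function of (A-25) concrete, the sketch below evaluates it with K' = 1 and grows a set of resolutions greedily from a candidate grid. The greedy search is a simple stand-in heuristic and not the exact infimum in (A-27); the kernel R, the signal S, the grid, the value of K, and the approximation of ||S||^2 by the GSNR over the full grid are all assumptions made for illustration.

```python
import numpy as np

def loss(p, R, S, norm_S_sq, K):
    """L_n of (A-25) with K' = 1: (||S||^2 - S^H Sigma^{-1} S) + K * sum(1 / p_j)."""
    p = np.asarray(p, dtype=float)
    Sigma = R(p[:, None], p[None, :])
    s = S(p)
    g2 = np.real(np.vdot(s, np.linalg.solve(Sigma, s)))
    return float((norm_S_sq - g2) + K * np.sum(1.0 / p))

def greedy_resolutions(candidates, R, S, norm_S_sq, K):
    """Add one candidate at a time while the loss keeps decreasing (heuristic)."""
    chosen, best = [], np.inf
    remaining = list(candidates)
    while remaining:
        trials = [(loss(chosen + [c], R, S, norm_S_sq, K), c) for c in remaining]
        value, c = min(trials)
        if value >= best:
            break                      # extra processing cost outweighs the gain
        chosen.append(c)
        remaining.remove(c)
        best = value
    return sorted(chosen), best

# Illustrative inputs (assumptions made for this example only).
R = lambda a, b: np.sqrt(np.minimum(a, b) / np.maximum(a, b))
S = lambda p: np.sinc(p / 3.0) * np.exp(0.5j * p)
grid = np.linspace(1.0, 10.0, 19)
Sig = R(grid[:, None], grid[None, :])
# ||S||^2 approximated by the GSNR obtained when every grid point is used.
norm_S_sq = float(np.real(np.vdot(S(grid), np.linalg.solve(Sig, S(grid)))))

chosen, best = greedy_resolutions(grid, R, S, norm_S_sq, K=0.05)
print(len(chosen), chosen, best)
```

The number of resolutions at which the greedy search stops plays the role of n^* in this toy setting; a larger K should lead to fewer resolutions being selected.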
A.3 Solution to Resolution Sampling Strategies
Introduction 


This section discusses the optimal selection of a subset of resolutions from a multiresolution signal to maximize the detection probability. We will call this a resolution sampling strategy. It is not possible to use every resolution because, without an iterative structure in place, the dimensions of the problem quickly become too large. A solution to this will be discussed later. The organization of this section is as follows. First, we pose the general problem of detecting a known signal in additive noise and discuss the solution. Next, an approach based on the work of Cambanis and Masry (1983) for selecting a discrete set of resolutions at which to form the detection statistic is described. The criteria used to evaluate the performance of the detector using the subset of resolutions are also provided. Both the continuous and discrete solutions are provided. Then, the multiresolution SAR problem is introduced, and the form of the covariance function (which is required by the solution formulation) is derived. In addition, the Markov property is verified and the remaining conditions required for the derivation of the sampling strategy are developed. An example target is described next, and the sampling strategies using both the continuous and discrete solution strategies are obtained and compared. Finally, the performance of the method based on the Cambanis and Masry approach is compared to other methods of selecting sampling strategies, and the saturation behavior as the number of resolution samples increases is investigated.


Problem  Statement  and  Solution  Development 

As before, we will denote the complex valued SAR image at pixel location x \in R^2, which takes into account resolution p, by T(x, p). We would like to consider the problem of detecting a multiresolution signal T_s(x, p) embedded in multiresolution additive noise T_c(x, p) at a given location x = x_0 over a given resolution interval p_l \le p \le p_u, i.e.,

    H_1: T(x, p) = T_s(x, p) + T_c(x, p),    p_l \le p \le p_u
    H_0: T(x, p) = T_c(x, p),                p_l \le p \le p_u

The covariance of the noise process is denoted R_{x_0}(p, p'), and

    R_{x_0}(p, p') = E[ T_c(x, p) T_c^*(x, p') ] |_{x = x_0}


The optimal test statistic is

    \varphi = \int_{p_l}^{p_u} f(x_0, p) T(x_0, p) dp

where f(x_0, p) is the generalized multiresolution matched filter found by solving the integral equation

    T_s(x_0, p) = \int_{p_l}^{p_u} R_{x_0}(p, p') f(x_0, p') dp'.

When the solution is to be found in the discrete domain, the process T(x_0, p) is sampled at N points {p_1, p_2, ..., p_N} (a method to select the points will be described below) to form the vector

    \underline{T}(x_0, \underline{p}) = { T(x_0, p_1), T(x_0, p_2), ..., T(x_0, p_N) }^t.

The test statistic, denoted \varphi_N to indicate its dependence on the N sample points, then becomes

    \varphi_N = \underline{f}^t \underline{T}(x_0, \underline{p})

where

    \underline{f} = R_{x_0}^{-1}(\underline{p}, \underline{p}') \underline{T}_s(x_0, \underline{p})

and R_{x_0}^{-1}(p_i, p_j), i, j = 1...N, is the inverse matrix of R_{x_0}(p_i, p_j), i, j = 1...N.


To find the optimal set of {p_i} directly would entail a large dimensional search over all subsets of N resolutions. Cambanis and Masry (1983) show that if R_{x_0}(p, p') is not differentiable on the diagonal of [p_l, p_u] x [p_l, p_u], then selecting the N quantiles of a function h^*(x_0, p), defined by

    h^*(x_0, p) \propto [ \beta(x_0, p) f^2(x_0, p) ]^{1/3}

will result in an asymptotically optimal solution for the number of resolutions being used. The function \beta(x_0, p) is given by

    \beta(x_0, p) = R'_{x_0}(p, p - 0) - R'_{x_0}(p, p + 0)

where

    R'_{x_0}(p, p') = \frac{ \partial R_{x_0}(p, p') }{ \partial p }.

If the noise is stationary, \beta(x_0, p) does not depend on p.


Cambanis and Masry also show that two measures of performance are monotonically related: P_d(\varphi), the probability of detection for the detector \varphi at a fixed false alarm rate \alpha, and s(\varphi), the generalized signal to noise ratio of the continuous-resolution optimal detector \varphi. s(\varphi) is defined by

    s^2(\varphi) = \int_{p_l}^{p_u} T_s(x_0, p) f(x_0, p) dp

The relationship between the two measures is given by

    P_d(\varphi) = \Phi[ s(\varphi) - \Phi^{-1}(1 - \alpha) ]

where \Phi is the standard normal cumulative distribution function. In discrete form,

    s^2(\varphi_N) = \underline{T}_s^t(x_0, \underline{p}) R_{x_0}^{-1}(\underline{p}, \underline{p}') \underline{T}_s(x_0, \underline{p})

and

    P_d(\varphi_N) = \Phi[ s(\varphi_N) - \Phi^{-1}(1 - \alpha) ]
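
The monotone relationship between the GSNR and the probability of detection is easy to evaluate numerically. The short sketch below does so using only the Python standard library; the function name and the sample values of s and \alpha are assumptions chosen for illustration.

```python
from statistics import NormalDist

def detection_probability(s, alpha):
    """P_d = Phi(s - Phi^{-1}(1 - alpha)), with Phi the standard normal CDF."""
    std_normal = NormalDist()                      # mean 0, standard deviation 1
    threshold = std_normal.inv_cdf(1.0 - alpha)    # Phi^{-1}(1 - alpha)
    return std_normal.cdf(s - threshold)

for s in (0.0, 1.0, 2.0, 3.0):
    print(s, detection_probability(s, alpha=0.05))
```

When s = 0 the detector has no discriminating power and the expression reduces to P_d = \alpha; as s grows, P_d increases toward 1, which is why maximizing the GSNR and maximizing the probability of detection select the same resolutions.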


Multiresolution  SAR  Sampling  Problem  Solution 

For the SAR imagery problem, we can derive the functional form of the covariance function R_{x_0}(p, p'), where T_c(x_0, p) is the multiresolution signature of the clutter after it has passed through the SAR imagery system. In addition, we can show that the process is Markov and is related by a transformation to a Brownian motion process.


Assuming that the system has impulse response h(y), which is a sinc function, the covariance R_{x_0}(p, p') can be evaluated as follows. We know that the Fourier transform of h(y) is an indicator function, denoted \hat{h}(u), and also that

    h(y/p) <-> p \hat{h}(p u)

where the double-sided arrow denotes Fourier transform. So,

    R_{x_0}(p, p') = \int \frac{1}{\sqrt{p p'}} h(y/p) h(y/p') dy

and thus by Parseval's Theorem,

    R_{x_0}(p, p') = \sqrt{p p'} \int \hat{h}(p u) \hat{h}(p' u) du

Now, for p < p', via properties of the indicator function,

    R_{x_0}(p, p') = \sqrt{p p'} \int \hat{h}(p' u) du
                   = \sqrt{p/p'} \int p' \hat{h}(p' u) du
                   = \sqrt{p/p'} \int \hat{h}(u') du'

For p > p',

    R_{x_0}(p, p') = \sqrt{p p'} \int \hat{h}(p u) du
                   = \sqrt{p'/p} \int p \hat{h}(p u) du
                   = \sqrt{p'/p} \int \hat{h}(u') du'
Now, assume that \hat{h}(u) has unit energy, i.e.,

    \int |\hat{h}(u)|^2 du = 1

and therefore let

    A = \int \hat{h}(u) du

Then, the covariance R_{x_0}(p, p') is given by

    R_{x_0}(p, p') = A \sqrt{p/p'},    p < p'
                   = A,                p = p'
                   = A \sqrt{p'/p},    p > p'

Note that R_{x_0}(p, p') is continuous at p = p', but it is not differentiable.


To demonstrate that the process is Markov, the correlation function must satisfy a scaling law condition with respect to resolution [Wong and Hajek]. Let p_1 < p_2 < p_3. Then,

    R_{x_0}(p_1, p_3) = A \sqrt{p_1/p_3}
                      = \frac{ A \sqrt{p_1/p_2} \cdot A \sqrt{p_2/p_3} }{ A }
                      = \frac{ R_{x_0}(p_1, p_2) R_{x_0}(p_2, p_3) }{ R_{x_0}(p_2, p_2) }

Thus, the process is Markov.
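
The scaling law used above can be checked numerically for the covariance A \sqrt{\min(p, p') / \max(p, p')}. The following sketch does this over all ordered triples from a small grid; the grid and the value of A are arbitrary illustrative choices.

```python
import itertools
import numpy as np

def R(p, q, A=1.0):
    """Clutter covariance A * sqrt(min(p, q) / max(p, q))."""
    return A * np.sqrt(min(p, q) / max(p, q))

resolutions = np.linspace(1.0, 10.0, 7)
worst = 0.0
# combinations() of a sorted grid yields triples with p1 < p2 < p3.
for p1, p2, p3 in itertools.combinations(resolutions, 3):
    lhs = R(p1, p3)
    rhs = R(p1, p2) * R(p2, p3) / R(p2, p2)
    worst = max(worst, abs(lhs - rhs))
print(worst)
```

The printed value is the largest deviation from the identity over the grid and should be at the level of floating point rounding.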
To find \beta(x_0, p), note that

    R'_{x_0}(p, p') = -\frac{A}{2p} \sqrt{p'/p},    p' < p
                    = 0,                            p = p'
                    = \frac{A}{2 \sqrt{p p'}},      p' > p

therefore

    \beta(x_0, p) = -\frac{A}{p}

Now, we will determine the transformation which maps the process T_c(x_0, p) into Brownian motion. Since the process T_c(x_0, p) is Markov, its autocorrelation function R_{x_0}(p, p') satisfies the equation

    R_{x_0}(u, s) = \frac{ R_{x_0}(u, t) R_{x_0}(t, s) }{ R_{x_0}(t, t) },    p_l \le s \le t \le u \le p_u.

Letting u = p_u and rearranging,

    R_{x_0}(t, s) = g(s) k(t),    s \le t

where

    g(t) = R_{x_0}(p_u, t)

and

    k(t) = \frac{ R_{x_0}(t, t) }{ R_{x_0}(p_u, t) }.

Let \tau be defined as

    \tau(t) = \frac{ g(t) }{ k(t) }

and Z as

    Z(p) = k(p) Y(\tau(p)),    p \in [p_l, p_u]

where Y is a circular complex Brownian motion process. Then,

    R_Z(p, p') = R_{x_0}(p, p')
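
The claim that Z(p) = k(p) Y(\tau(p)) has the same covariance as the clutter process can be verified numerically. The sketch below assumes the covariance A \sqrt{\min(p, p') / \max(p, p')} and a Brownian motion normalized so that E[Y(a) Y^*(b)] = \min(a, b); the grid, the value of A, and the resolution interval are illustrative choices.

```python
import numpy as np

A, p_l, p_u = 1.0, 1.0, 10.0

def R(p, q):
    """Clutter covariance A * sqrt(min(p, q) / max(p, q))."""
    return A * np.sqrt(np.minimum(p, q) / np.maximum(p, q))

def g(t):
    return R(p_u, t)

def k(t):
    return R(t, t) / R(p_u, t)

def tau(t):
    return g(t) / k(t)

p = np.linspace(p_l, p_u, 50)
P, Q = np.meshgrid(p, p, indexing="ij")
# Covariance of Z(p) = k(p) Y(tau(p)) under the assumed normalization of Y.
R_Z = k(P) * k(Q) * np.minimum(tau(P), tau(Q))
max_err = np.max(np.abs(R_Z - R(P, Q)))
print(max_err)
```

For this kernel the time change works out to \tau(t) = A t / p_u, which is increasing in t, so ordering in resolution is preserved by the transformation.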

We have described an invertible transformation which generates, from a Brownian motion process, a process with the same distribution as the one we have been considering. Thus, we can use the inverse transformation on the original process to generate a system equation which is driven by Brownian noise. The integral equation for this system will be easier to solve, and the solution can be transformed back to give a solution to the original problem. The transformation proceeds from the original equation

    T(x_0, p) = T_s(x_0, p) + T_c(x_0, p)

to

    \frac{ T(x_0, \tau^{-1}(\tilde{p})) }{ k(\tau^{-1}(\tilde{p})) } = \frac{ T_s(x_0, \tau^{-1}(\tilde{p})) }{ k(\tau^{-1}(\tilde{p})) } + Y(x_0, \tilde{p})

or

    \tilde{T}(x_0, \tilde{p}) = \tilde{T}_s(x_0, \tilde{p}) + Y(x_0, \tilde{p})

This problem can be solved for \tilde{f}(x_0, \tilde{p}) via the integral equation, and the optimal sampling strategy {\tilde{p}_i} can be found via the procedure outlined above. This transformed set of resolutions can then be converted to the set {p_i}.

The transformations k(p) and \tau(p) can be determined using the equations given above, which are defined on the range p_l \le p, p' \le p_u (and the transformed range \tilde{p}_l \le \tilde{p}, \tilde{p}' \le \tilde{p}_u).


Three  Point  Target  Example 

It is instructive to attempt to find the generalized matched filter, f(x_0, p), and the optimal sampling strategy {p_i}, i = 1...N, for a simple target and a given N, and to compare the results from the continuous solution to those obtained from the discrete solution. Consider the case of three equal amplitude collinear scatterers at locations x_i, i = 1...3, all at range y = 0 and equidistant from each other. The target t(x, y) can be written as

    t(x, y) = \sum_{i=1}^{3} \delta(x - x_i) \delta(y)

The multiresolution signature of this target at any position x can be evaluated using the SAR system equation, and is

r.  l2%X  i 

Ts  U  p)  =  2^e  smc[—^-) 

i  =  1 


The magnitude of the multiresolution target signature is shown in Figure A-1 at x_0 = {0, 0}. Figure A-2 provides an example of several sample multiresolution clutter signatures for comparison. Both plots show magnitude versus resolution.

Even in this simple case, the integral equation for the solution f(x_0, p) is difficult, if not impossible, to solve. We will use the transformation described previously to solve this problem.


After  applying  the  transformation,  target  representation  at  x  =  0  thus  becomes 


7>0’P) 


e 


illixl 


^ a  . 

sine 


r-Apxi 


A 

J 


The integral equation becomes

    \tilde{T}_s(x_0, p) = \int_{p_l}^{p_u} R_Y(p, p') \tilde{f}(x_0, p') dp'

where

    R_Y(p, p') = \min(p, p')

We can rewrite the integral equation using the expression for R_Y(p, p'):

    \tilde{T}_s(x_0, p) = \int_{p_l}^{p} p' \tilde{f}(x_0, p') dp' + p \int_{p}^{p_u} \tilde{f}(x_0, p') dp'


Figure A-1: Target magnitude across resolution. (Horizontal axis: Resolution.)

Figure A-2: Examples of clutter magnitude across resolution. (Panel title: Clutter paths; horizontal axis: Resolution.)

Taking the partial derivative with respect to p of both sides gives

    \frac{\partial}{\partial p} \tilde{T}_s(x_0, p) = p \tilde{f}(x_0, p) + \int_{p}^{p_u} \tilde{f}(x_0, p') dp' - p \tilde{f}(x_0, p)
                                                   = \int_{p}^{p_u} \tilde{f}(x_0, p') dp'

Taking the partial derivative again with respect to p yields

    \frac{\partial^2}{\partial p^2} \tilde{T}_s(x_0, p) = -\tilde{f}(x_0, p)
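
The two differentiation steps above can be checked numerically: starting from an arbitrary filter, we form the signature through the kernel \min(p, p') and confirm that the negative second derivative recovers the filter. The test function cos(p), the grid, and the resolution interval are assumptions chosen only for illustration.

```python
import numpy as np

p_l, p_u, n = 1.0, 10.0, 901
p = np.linspace(p_l, p_u, n)
dp = p[1] - p[0]
f = np.cos(p)                      # hypothetical filter, for illustration only

def trapezoid(values, spacing):
    """Composite trapezoidal rule on a uniform grid."""
    return spacing * (np.sum(values) - 0.5 * (values[0] + values[-1]))

# T_s(p_i) = integral over [p_l, p_u] of min(p_i, p') f(p') dp'
T_s = np.array([trapezoid(np.minimum(pi, p) * f, dp) for pi in p])

# Central second difference at the interior grid points.
d2 = (T_s[2:] - 2.0 * T_s[1:-1] + T_s[:-2]) / dp**2
max_err = np.max(np.abs(d2 + f[1:-1]))
print(max_err)
```

This also suggests a practical route in the transformed problem: once the transformed target signature is available on a grid, the matched filter can be approximated by a second difference rather than by solving an integral equation.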

To complete the solution, the quantiles of

    h^*(x_0, p) \propto [ \beta(x_0, p) \tilde{f}^2(x_0, p) ]^{1/3}

where \beta(x_0, p) = -1 (determined from R_Y(p, p')), must be determined. Substituting the results from the solution to the integral equation


h*  (x0,  p) 


(■&Q5  p) 


which  can  be  shown  to  be 


The quantiles {\tilde{p}_i} of this function were calculated numerically, then transformed back, and are (1.2, 1.6, 2.3, 3.2, 4.9, 10.0).


For the three point scatterer case, the discrete solution was calculated directly from T_s(x_0, p) and R_{x_0}(p, p') as (1.1, 1.2, 1.6, 2.3, 3.4, 10.0). The GSNR and ROCs were calculated as described above and are shown below in Figure A-3.
Figure A-3 shows the ROC curves (detection versus false alarm probability) for the two solutions described above: the exact solution to the three point scatterer problem (solution of the integral equation) versus the discrete solution (inversion of the sampled covariance matrix to determine f). The original sampling density of the discrete covariance matrix included a range of resolutions from 1 to 10, sampled every 0.1. The exact solution does outperform the approximate solution, as was expected.

Comparison  of  Cambanis/Masry  Approach  to  Other  Sampling  Strategies 

Figure A-4 shows the performance of the discrete Cambanis/Masry method of determining a sampling strategy for the three point target problem versus the multiresolution sampling strategy used in the previous project (called multires) and a dyadic splits, or wavelet, strategy. The dyadic splits are determined as 1 ft., 2 ft., 4 ft., etc. The number of resolutions (N = 6) used for each strategy and the signal to noise ratio (determined by A) were held constant. Use of the Cambanis/Masry approach to determine the sampling strategy resulted in increased performance, especially over the wavelet sampling approach.

Saturation Behavior of Cambanis/Masry Sampling Strategy Approach

Figure A-5 illustrates the saturation behavior of the discrete version of the Cambanis/Masry sampling strategy. A new "optimal" sampling strategy was generated for each value of N, the number of resolution samples used. For each case, the finest resolution used was 1 ft. and the coarsest resolution used was 10 ft. For the N = infinity case, all 91 available samples were used. Little improvement is seen for N larger than 12; however, there is significant improvement beyond the single resolution case (N = 1).

Saturation  Behavior  of  Dyadic  Split  Sampling  Strategy  Approach 

Figure A-6 shows the saturation behavior of the dyadic split/wavelet approach to sampling (the wavelet sampling strategy was fed to the same GSNR/ROC machinery as if it were the sampling strategy selected by the C/M mechanism). Performance saturates at lower values of N (N = 6 versus N = 12) for the wavelet sampling strategy, and the same levels of performance are not achieved. Note that for this strategy, each increase in N uses a coarser resolution, and little information is gained to aid performance as resolution becomes coarser.
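The two sets of resolutions discussed above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the 0.1 ft. grid spacing is an assumption inferred from the 91 available samples between 1 ft. and 10 ft.

```python
def candidate_resolutions(finest=1.0, coarsest=10.0, step=0.1):
    """Uniform grid of candidate resolutions in feet (spacing is an assumption)."""
    count = int(round((coarsest - finest) / step)) + 1
    return [round(finest + i * step, 10) for i in range(count)]


def dyadic_resolutions(n, finest=1.0):
    """Dyadic (wavelet-style) resolutions: finest, 2*finest, 4*finest, ..."""
    return [finest * 2 ** k for k in range(n)]


grid = candidate_resolutions()
print(len(grid))              # number of available samples on the grid
print(dyadic_resolutions(6))  # each added sample is coarser than the last
```

Under these assumptions the dyadic strategy only ever adds coarser resolutions as N grows, which matches the observation that it gains little information beyond small N.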
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solutions  for  the  optimal  sampling  strategy. 


Figure A-4: ROC curves comparing different sampling strategies at a fixed number of samples.

Figure A-6: ROC curves for wavelet sampling strategies with varying numbers of samples (horizontal axis: Pfa, 0 to 0.1).


A.4  Summary 


In this appendix we have given a methodology for choosing the optimal resolutions according to a target detection criterion. We did this under two cases: in the first, the only criterion was to maximize the probability of detection; in the second, the criterion combined maximizing the probability of detection with minimizing processing. The technical approach was to model the resolutions from a wavelet transform viewpoint, allowing the scale to range over a continuum and restricting the sampling location to a fixed point. We provided an algorithm for determining the optimal resolutions under both criteria and stated some rigorous properties of these optimal resolutions. We also derived some interesting properties of the wavelet transform process representing our different resolution images.

Recommended  areas  for  future  research  are: 

(i) Generalize the original model to include a random phase on the target, i.e., assume a model where

$$I(x;\rho) = \frac{1}{\sqrt{\rho}} \int h\!\left(\frac{x-y}{\rho}\right) \left( W(y) + g_t(y)\, e^{iV} \right) dy + N_0(x;\rho), \qquad x \in X_\rho, \tag{A-28}$$

where V is a random variable with uniform distribution over (0, 2π) and is independent of the clutter process W and the thermal noise process N0. This model is much more realistic since we never expect to know the phase of the target reflectivity (this would require precise knowledge of the radar system and collection geometry).


(ii) In the approach in this appendix, we looked at only one pixel location and changed resolutions at that location. The more realistic scenario is to look at optimal resolutions based on considering the data at all pixel locations, i.e., based on

$$\{ I(x;\rho) : x \in X_\rho \}, \qquad \rho_l \le \rho \le \rho_u. \tag{A-29}$$

This would mean considering the full wavelet transform both in the scale parameter and in the translation parameter. One of the problems here is that the number of pixels is different for different resolutions, so there is a problem comparing images of different sizes. A possible starting point for solving this problem is to break up the scene into small square subpatches which are much smaller than the finest resolution, and then approximate $I(x;\rho)$ as a linear combination of these subpatches. This would be a sort of discrete wavelet transform model-based approach to the multiresolution problem. Another possible approach would be to look at different metrics relating the probability distributions of two images at two different resolutions (the metrics should be closely related to the probability of detection), which do not need the same number of pixels.


(iii) Investigate simple resolution selection algorithms for the general case of a loss function incorporating the costs of processing. As discussed earlier, there are such algorithms in the case of no processing costs when the general noise process N has independent increments (true if the impulse response is a sinc function).

(iii') Derive simple resolution selection algorithms under the framework given in (ii).


(iv) Develop an analogous framework (including simple algorithms) to that presented in the appendix in the case where $P_D = 1$. Recall that we used the generalized signal-to-noise ratio (GSNR), and the justification for doing so was that the resolutions which maximize the GSNR are also optimal under a probability-of-detection criterion. But this depended on assumption (a5), which implied that $P_D < 1$. In many cases (e.g., when delta functions are part of the target), this does not hold, and it would be of interest to research what happens in this more general framework.

Appendix B Properties of Noisy Wavelet Transform

In  this  appendix,  we  establish  a  couple  of  interesting  properties  for  the  I  process 
which  were  given  in  Section  2.  As  stated  there,  we  can  view  I  as  a  noisy  stochastic 
wavelet  transform.  The  first  property  we  present  is  that  the  autocorrelation  function 
R  is  continuous  provided  the  mother  wavelet  function  h  is  square-integrable,  which  we 
assumed  it  was.  The  second  result  we  prove  is  that  under  some  technical  conditions 
on  the  function  h ,  the  wavelet  transform  process  at  a  fixed  translation  is  a  Gaussian 
Markov  process  in  the  scale  parameter.  We  now  state  explicitly  the  first  result. 


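Proposition Appendix B.1 Suppose h is square-integrable and R is the autocorrelation of the wavelet transform process I. Then R is continuous.

Proof (sketch). The autocorrelation function is given by

$$R(\rho,\rho') = \sigma_c^2 \int h\!\left(\frac{y}{\rho}\right) h^*\!\left(\frac{y}{\rho'}\right) dy + \frac{\sigma_{N_0}^2}{\max\{\rho,\rho'\}} \tag{B-1}$$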


where the second term represents the autocorrelation of the additive thermal noise process and the first term represents the autocorrelation of the integrated clutter process (convolved with a scaled version of h). To prove the result it suffices to show that the first term is continuous in the argument $(\rho,\rho')$. This can be done by a very common analysis trick whereby we approximate h by a function $h_0$ which is continuous and has compact support. We then use this trick and the Cauchy-Schwarz inequality to prove the desired result. The details are messy and hence are omitted.

We now state a second result, giving sufficient conditions on h for the Markov property of I to hold in both directions; we state it precisely as a Proposition and then give a proof. Before doing so, we need a preliminary result which is essentially given in Wong/Hajek (1985), but which we state as a Lemma both for completeness and because we need to adapt their result to the case of circular complex Gaussian processes.


Lemma Appendix B.2 Suppose $0 < t_l < t_u < \infty$ and suppose $X = \{X(t) : t_l \le t \le t_u\}$ is a circular complex Gaussian process which satisfies

$$R(t,t) > 0, \qquad t_l \le t \le t_u, \tag{B-2}$$

and

$$\mu\bigl(\{ s : t_l \le s \le t_u,\; R(t,s) = 0 \}\bigr) = 0 \qquad \forall\, t, \tag{B-3}$$

where $\mu$ is Lebesgue measure (for intuition, notice that this technical condition is satisfied if for every t the function R(t,s) = 0 for at most a countable set of s's). Then the following conditions are all equivalent:

(i) X is forward Markov.

(ii) X is reverse Markov.

(iii) the autocorrelation function R satisfies

$$R(u,s) = \frac{R(u,t)\,R(t,s)}{R(t,t)}, \qquad t_l \le s \le t \le u \le t_u. \tag{B-4}$$

(iv) X has the same distribution as Z where $Z(t) = f(t)\,Y(\tau(t))$, where f is a deterministic function, Y is circular complex Brownian motion, and $\tau$ is a non-decreasing function from $[t_l,t_u]$ to $[0,\infty)$.
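Condition (iii) is easy to test numerically for a candidate autocorrelation. The sketch below is an illustration only: the function names and the three example kernels are assumptions chosen for this example (a kernel of the form 1/max(t,s), the Brownian kernel min(t,s), and a squared-exponential kernel that is not Markov).

```python
import math


def triangle_gap(R, s, t, u):
    """|R(u,s) - R(u,t) R(t,s) / R(t,t)| for s <= t <= u, cf. (B-4)."""
    return abs(R(u, s) - R(u, t) * R(t, s) / R(t, t))


def max_gap(R, grid):
    """Largest violation of (B-4) over all ordered triples from an increasing grid."""
    worst = 0.0
    for i, s in enumerate(grid):
        for j in range(i, len(grid)):
            for k in range(j, len(grid)):
                worst = max(worst, triangle_gap(R, s, grid[j], grid[k]))
    return worst


# Example kernels (assumptions for illustration, real-valued).
inverse_max = lambda t, s: 1.0 / max(t, s)
brownian = lambda t, s: min(t, s)
squared_exp = lambda t, s: math.exp(-(t - s) ** 2)

grid = [1.0 + 0.5 * i for i in range(19)]  # 1.0, 1.5, ..., 10.0
for name, kernel in [("1/max", inverse_max), ("min", brownian), ("sq-exp", squared_exp)]:
    print(name, max_gap(kernel, grid))
```

The first two kernels satisfy (B-4) up to rounding error, while the squared-exponential kernel shows a clear gap, so a Gaussian process with that autocorrelation cannot be Markov.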


Proof. By the general theory of Markov processes (cf. Wong/Hajek (1985), p. 65), it is well known that (i) and (ii) are equivalent. Thus we only need to verify that (i), (iii), and (iv) are equivalent. Suppose (i) is true and consider $s < t < u$. Then

$$E(X(u) \mid X(s)) = \frac{R(u,s)}{R(s,s)}\, X(s) \tag{B-5}$$

by Lemma Appendix B.3, which follows. Also we have that

$$\begin{aligned}
E(X(u) \mid X(s)) &= E\bigl( E(X(u) \mid X(s), X(t)) \mid X(s) \bigr) && \text{prop. of cond. expectations} \\
&= E\bigl( E(X(u) \mid X(t)) \mid X(s) \bigr) && \text{by Markov property of } X \\
&= \frac{R(u,t)}{R(t,t)}\, E(X(t) \mid X(s)) \\
&= \frac{R(u,t)}{R(t,t)}\, \frac{R(t,s)}{R(s,s)}\, X(s).
\end{aligned} \tag{B-6}$$

Since X(s) is non-degenerate, (B-5) and (B-6) imply that (iii) holds.

Now suppose (iii) is true. Then we first claim that

$$R(t,s) \ne 0 \quad \text{for every } s, t. \tag{B-7}$$

To see that this is true, suppose the contrary, i.e., suppose that there exist $s < u$ such that

$$R(u,s) = 0. \tag{B-8}$$

But the hypothesis in (iii) implies that

$$\frac{R(u,t)\,R(t,s)}{R(t,t)} = 0 \qquad \forall\, t \in (s,u). \tag{B-9}$$

But this would imply that either R(u,t) = 0 or R(t,s) = 0 for every $t \in (s,u)$, and it is easy to show that this violates the assumption contained in (B-3); hence we have a contradiction. Thus we have verified the claim expressed by (B-7). Now


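$$R(t,s) = \frac{R(t_u,s)\,R(t,t)}{R(t_u,t)} = g(s)f(t), \qquad t \ge s, \tag{B-10}$$

where

$$g(s) = R(t_u,s), \tag{B-11}$$

$$f(t) = \frac{R(t,t)}{R(t_u,t)}. \tag{B-12}$$

Now let $\tau$ be defined by

$$\tau(t) = \frac{g(t)}{f^*(t)}, \qquad t \in [t_l,t_u]. \tag{B-13}$$

Clearly $\tau$ is real and positive since

$$g(t)f(t) = R(t,t) > 0. \tag{B-14}$$

Finally $\tau$ is non-decreasing since for $s < t$, we have

$$\begin{aligned}
\tau(s) &= \frac{g(s)}{f^*(s)} = \frac{g(s)f(t)}{f^*(s)f(t)} = \frac{R(t,s)}{f^*(s)f(t)} \\
&\le \frac{\sqrt{R(t,t)\,R(s,s)}}{|f(s)f(t)|} && \text{by Cauchy-Schwarz ineq.} \\
&= \sqrt{\frac{g(s)}{f^*(s)}}\,\sqrt{\frac{g(t)}{f^*(t)}} \\
&= \sqrt{\tau(s)\,\tau(t)}.
\end{aligned} \tag{B-15}$$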


Now let

$$Z(t) = f(t)\,Y(\tau(t)). \tag{B-16}$$

Clearly Z is a circular complex Gaussian process and for $s < t$,

$$\begin{aligned}
E(Z^*(s)Z(t)) &= f^*(s)f(t)\,E\bigl(Y^*(\tau(s))\,Y(\tau(t))\bigr) \\
&= f^*(s)f(t)\,\tau(s) \\
&= g(s)f(t) \\
&= R(t,s).
\end{aligned} \tag{B-17}$$

Hence (iv) has been verified.

That (iv) implies (i) is immediate since Y has independent increments and so automatically is Markov, and by recalling that f is a function which never vanishes (i.e., never equals 0).
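The construction used in the proof can be traced on a concrete kernel. The sketch below is an illustration under stated assumptions: it works in the real-valued case (so conjugates drop out and the time change is taken as τ = g/f, the choice that makes f(s)f(t)τ(s) = g(s)f(t) as in (B-17)), the helper names are hypothetical, and the example kernel 1/max(t,s) is chosen only because it satisfies (B-4).

```python
def factorize(R, t_upper):
    """Return (f, g, tau) built from R as in the proof, real-valued case only."""
    g = lambda s: R(t_upper, s)               # cf. (B-11)
    f = lambda t: R(t, t) / R(t_upper, t)     # cf. (B-12)
    tau = lambda t: g(t) / f(t)               # assumed form of the time change
    return f, g, tau


def max_factorization_error(R, f, g, tau, grid):
    """Largest deviation of g(s)f(t) and f(s)f(t)tau(s) from R(t,s) over s <= t."""
    worst = 0.0
    for i, s in enumerate(grid):
        for t in grid[i:]:
            worst = max(worst, abs(R(t, s) - g(s) * f(t)))
            worst = max(worst, abs(R(t, s) - f(s) * f(t) * tau(s)))
    return worst


def is_non_decreasing(values):
    return all(a <= b for a, b in zip(values, values[1:]))


R = lambda t, s: 1.0 / max(t, s)              # example kernel (assumption)
grid = [1.0 + 0.5 * i for i in range(19)]     # 1.0, 1.5, ..., 10.0
f, g, tau = factorize(R, t_upper=grid[-1])
print(max_factorization_error(R, f, g, tau, grid))
print(is_non_decreasing([tau(t) for t in grid]))
```

For this kernel the sketch gives g constant, f(t) proportional to 1/t, and τ(t) proportional to t, so the process is a rescaled, time-changed Brownian motion as condition (iv) states.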


Lemma Appendix B.3 Suppose (U, V) is a $\mathbb{C}^2$-valued random vector which is circular complex Gaussian with $R_{VU} = E(VU^*)$ and $R_{UU} = E(UU^*)$. Then

(a) the random variable

$$V - \frac{R_{VU}}{R_{UU}}\, U \tag{B-18}$$

is independent of U.

(b) the conditional expectation of V given U satisfies

$$E(V \mid U) = \frac{R_{VU}}{R_{UU}}\, U. \tag{B-19}$$

Proof. (b) is an easy consequence of (a), and (a) is a well-known result obtained by recognizing that any linear combination of (U, V) is again circular complex Gaussian.
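A small Monte Carlo experiment can make the lemma concrete. The sketch below is only an illustration: the pair (U, V) is generated as V = cU + dW with W independent of U, and the coefficient values, sample size, and seed are arbitrary assumptions. It estimates the ratio R_VU/R_UU from samples and checks that the residual V - cU is uncorrelated with U (which, for jointly Gaussian variables, corresponds to the independence in part (a)).

```python
import random


def circular_gaussian(rng):
    """One sample of a unit-variance circular complex Gaussian variable."""
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / 2 ** 0.5


def simulate(coef=0.8 - 0.3j, resid_scale=0.5, n=20000, seed=0):
    """Return (sample estimate of R_VU / R_UU, sample mean of (V - coef*U) U*)."""
    rng = random.Random(seed)
    sum_vu = 0.0
    sum_uu = 0.0
    sum_ru = 0.0
    for _ in range(n):
        u = circular_gaussian(rng)
        v = coef * u + resid_scale * circular_gaussian(rng)
        sum_vu += v * u.conjugate()
        sum_uu += abs(u) ** 2
        sum_ru += (v - coef * u) * u.conjugate()
    return sum_vu / sum_uu, sum_ru / n


ratio, cross = simulate()
print(ratio)  # should be close to the coefficient used to build V
print(cross)  # should be close to zero
```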


We now state a couple of lemmas and give a definition before giving the main result (Proposition Appendix B.6) of this appendix, which provides sufficient conditions for the wavelet transform process to be Markovian.


Lemma Appendix B.4 Suppose X and Y are two independent Markov processes. Then X + Y is Markov.

Proof. Easy consequence of the properties of conditional expectation.


Lemma Appendix B.5 Suppose $\{X(t) : 0 < t_l \le t \le t_u < \infty\}$ is a circular complex Gaussian process which is Markov and whose autocorrelation function R satisfies

$$R(t,s) = R(s,s) > 0, \qquad s \le t. \tag{B-20}$$

Then there exists a non-decreasing function $\eta$ from $[t_l,t_u]$ to $[0,\infty)$ such that X has the same distribution as the process $\{Y(\eta(t)) : t_l \le t \le t_u\}$, where Y is a complex Brownian motion process, and in particular X has independent increments.


Proof. This is a straightforward corollary to the proof of Lemma Appendix B.2. Specifically it suffices to show that the function f in that proof is 1. But recall that

$$\begin{aligned}
f(t) &= \frac{R(t,t)}{R(t_u,t)} && \text{by def. of } f \text{ in proof} \\
&= \frac{R(t,t)}{R(t,t)} = 1 && \text{by the hypothesis.}
\end{aligned}$$


Definition/Notation. For a square-integrable function f from $\mathbb{R}^2$ to $\mathbb{C}$, we denote the Fourier transform by $\hat f$. We let $\mathcal{H}$ denote a special class of functions defined by $h \in \mathcal{H}$ if and only if

$$h(y_1,y_2) = C\,|y_1|^{k_1}\,|y_2|^{k_2}\,1_{[\alpha_1,\beta_1]}(y_1)\,1_{[\alpha_2,\beta_2]}(y_2) \qquad \forall\, y_1, y_2 \in \mathbb{R}, \tag{B-21}$$

where $C > 0$, $k_1, k_2 \ge 0$, $\alpha_1 < 0 < \beta_1$ and $\alpha_2 < 0 < \beta_2$, and

$$1_{[\alpha_j,\beta_j]}(y_j) = \begin{cases} 1 & y_j \in [\alpha_j,\beta_j] \\ 0 & \text{otherwise} \end{cases} \qquad j = 1, 2. \tag{B-22}$$

Proposition Appendix B.6 Suppose we have the wavelet transform process I (under H0) where

$$I(\rho) = \int h\!\left(\frac{y}{\rho}\right) \bigl( W(y) + g_t(y) \bigr)\, dy + N_0(\rho), \tag{B-23}$$

where W is Gaussian white noise with parameter $\sigma_c^2$ and $N_0$ is a circular complex Gaussian process. Assume the following:

(i) W and $N_0$ are independent.

(ii) the process $N_0(x;\cdot)$ is a circular complex Gaussian process with

$$E\bigl( N_0(x;\rho)\, N_0^*(x;\rho') \bigr) = \frac{\sigma_{N_0}^2}{\max\{\rho,\rho'\}}, \qquad \rho_l \le \rho, \rho' \le \rho_u. \tag{B-24}$$

(iii) either $h \in \mathcal{H}$ or $\hat h \in \mathcal{H}$.

Then:

(a) I is a Markov process in both the positive and negative directions.

(b) in the special case that either h or $\hat h$ is equal to $C\,1_{[\alpha_1,\beta_1]}(y_1)\,1_{[\alpha_2,\beta_2]}(y_2)$, I has independent increments.


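Proof. Let V be the process given by

$$V(\rho) = \int h\!\left(\frac{y}{\rho}\right) W(y)\, dy, \qquad \rho_l \le \rho \le \rho_u. \tag{B-25}$$

By the independence of V and $N_0$, it suffices by Lemma Appendix B.4 to prove that V and $N_0$, individually, are Markov. But $N_0$ is Markov since by assumption it has independent increments (this was discussed in Section 2) and by invoking Lemma Appendix B.2, (iii). Now to prove that V is Markov, we will show that the assumptions specified by Lemma Appendix B.2 hold and also that condition (iii) in Lemma Appendix B.2 is true. Hence it now suffices to show that

$$R_V(\rho,\rho') \ne 0 \quad \text{for all } \rho, \rho' \tag{B-26}$$

and that (B-4) holds. It is easy to see that the autocorrelation function for V, $R_V$, is given by

$$R_V(\rho,\rho') = \sigma_c^2 \int h\!\left(\frac{y}{\rho}\right) h^*\!\left(\frac{y}{\rho'}\right) dy. \tag{B-27}$$

Assume that $h \in \mathcal{H}$, i.e.,

$$h(y_1,y_2) = C\,|y_1|^{k_1}\,|y_2|^{k_2}\,1_{[\alpha_1,\beta_1]}(y_1)\,1_{[\alpha_2,\beta_2]}(y_2) \qquad \forall\, y_1, y_2 \in \mathbb{R}, \tag{B-28}$$

where $\alpha_1 < 0 < \beta_1$, $\alpha_2 < 0 < \beta_2$ and $k_1, k_2 \ge 0$. Then by (B-27) and the above form for h, we see that (B-26) is true. Now let $\rho_l \le s \le t \le u \le \rho_u$. Then

$$\begin{aligned}
R_V(u,s) &= \sigma_c^2 \int h\!\left(\frac{y}{u}\right) h^*\!\left(\frac{y}{s}\right) dy \\
&= \frac{\sigma_c^2 C^2}{(us)^{k_1+k_2}} \int\!\!\int |y_1|^{2k_1} |y_2|^{2k_2}\, 1_{[\alpha_1,\beta_1]}\!\left(\tfrac{y_1}{u}\right) 1_{[\alpha_1,\beta_1]}\!\left(\tfrac{y_1}{s}\right) 1_{[\alpha_2,\beta_2]}\!\left(\tfrac{y_2}{u}\right) 1_{[\alpha_2,\beta_2]}\!\left(\tfrac{y_2}{s}\right) dy_1\, dy_2 \\
&= \frac{\sigma_c^2 C^2}{(us)^{k_1+k_2}} \prod_{j=1}^{2} \int_{\alpha_j s}^{\beta_j s} |y_j|^{2k_j}\, dy_j.
\end{aligned} \tag{B-29}$$

Also we have

$$\begin{aligned}
\frac{R_V(u,t)\,R_V(t,s)}{R_V(t,t)} &= \frac{\left[ \sigma_c^2 \int h\!\left(\frac{y}{u}\right) h^*\!\left(\frac{y}{t}\right) dy \right] \left[ \sigma_c^2 \int h\!\left(\frac{y}{t}\right) h^*\!\left(\frac{y}{s}\right) dy \right]}{\sigma_c^2 \int \left| h\!\left(\frac{y}{t}\right) \right|^2 dy} \\
&= \frac{\left[ \frac{\sigma_c^2 C^2}{(ut)^{k_1+k_2}} \prod_{j=1}^{2} \int_{\alpha_j t}^{\beta_j t} |y_j|^{2k_j}\, dy_j \right] \left[ \frac{\sigma_c^2 C^2}{(ts)^{k_1+k_2}} \prod_{j=1}^{2} \int_{\alpha_j s}^{\beta_j s} |y_j|^{2k_j}\, dy_j \right]}{\frac{\sigma_c^2 C^2}{t^{2(k_1+k_2)}} \prod_{j=1}^{2} \int_{\alpha_j t}^{\beta_j t} |y_j|^{2k_j}\, dy_j} \\
&= \frac{\sigma_c^2 C^2}{(us)^{k_1+k_2}} \prod_{j=1}^{2} \int_{\alpha_j s}^{\beta_j s} |y_j|^{2k_j}\, dy_j.
\end{aligned} \tag{B-30}$$

By (B-29) and (B-30), we see that (B-4) is satisfied in this case. We omit the proof for the case that $\hat h \in \mathcal{H}$. It essentially just uses Parseval's theorem and then invokes the equality just proved. This then proves (a).

To prove (b), since $N_0$ has independent increments and is independent of V, it suffices to prove that V has independent increments. First consider the case where h is as above with $k_1 = k_2 = 0$. Then by Lemma Appendix B.5, to prove the result, it suffices to show that

$$R_V(u,s) = R_V(s,s), \qquad u \ge s. \tag{B-31}$$

But this is immediate by inspection of (B-29). Again the case where $\hat h$ has the above form is handled by Parseval's theorem.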
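The two facts used in this proof, the factorization (B-4) for h in the class defined by (B-21) and the property (B-31) when k1 = k2 = 0, can be checked numerically. The sketch below is an illustration only: it evaluates the clutter autocorrelation as a product of two one-dimensional overlap integrals with a midpoint rule, and the exponents, box limits, step count, and function names are all assumptions chosen for the example.

```python
def overlap_integral(u, s, k, alpha, beta, steps=2000):
    """Midpoint-rule value of the integral of |y/u|^k |y/s|^k over the set
    where both y/u and y/s lie in [alpha, beta] (alpha < 0 < beta, u, s > 0)."""
    lo = max(alpha * u, alpha * s)
    hi = min(beta * u, beta * s)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * width
        total += (abs(y / u) ** k) * (abs(y / s) ** k)
    return total * width


def r_v(u, s, k1=1, k2=2, box=((-1.0, 2.0), (-0.5, 1.5)), sigma2=1.0, c=1.0):
    """Clutter autocorrelation R_V(u, s) for a separable h of the (B-21) form."""
    (a1, b1), (a2, b2) = box
    return (sigma2 * c ** 2
            * overlap_integral(u, s, k1, a1, b1)
            * overlap_integral(u, s, k2, a2, b2))


s, t, u = 1.0, 2.0, 3.0
lhs = r_v(u, s)
rhs = r_v(u, t) * r_v(t, s) / r_v(t, t)
print(lhs, rhs)                                          # the two sides of (B-4)
print(r_v(u, s, k1=0, k2=0), r_v(s, s, k1=0, k2=0))      # the two sides of (B-31)
```

With k1 = k2 = 0 the integrand is constant, so the value depends only on the smaller of the two resolutions, which is exactly the independent-increments condition of Lemma Appendix B.5.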

Remark. The above result shows that in the case where h is a sinc function (i.e., represents the Fourier transform of a square aperture) or h is a rect function, our wavelet transform process I is an independent-increments Gaussian process. By the proof given above, we have actually shown something slightly stronger: under the hypothesis assumed in part (b) of the Proposition, I is a time-scaled complex Brownian motion. Specifically, there exists a non-decreasing function $\eta$ from $[\rho_l,\rho_u]$ to $[0,\infty)$ such that $I(\rho)$ has the same distribution as the process $Y(\eta(\rho))$, where Y is a complex Brownian motion. For some desired extensions of results given in this appendix, this assumption of h or its Fourier transform being a rect function may provide us with a nice framework for achieving those extensions. There is a lot known about Brownian motion and its properties, and there is a lot known about doing statistical inference for the case of a signal in Brownian motion. Our comment here is simply that this may be very beneficial for future research.

Appendix C Reproducing Kernel Hilbert Spaces Representation


The purpose of this appendix is to give some background on reproducing kernel Hilbert spaces and how they can be used to represent arbitrary second-order stochastic processes. This representation turns out to simplify and make transparent many arguments (even though at first glance this might not appear to be true). This representation is used in several places in the main body of the appendix and is also used in the next appendix, where we discuss and verify some properties of the generalized signal-to-noise ratios stated in subsections A.1 and A.2.

Background. We suppose that $X = \{X(t) : t \in I\}$ is a second-order stochastic process with mean 0 and covariance function

$$R(s,t) = E\bigl( X(s)\,X^*(t) \bigr), \qquad s, t \in I, \tag{C-1}$$

where I is an interval in $\mathbb{R}$. Note that we are allowing for the possibility that X is $\mathbb{C}$-valued. We actually discuss two representations: the Karhunen-Loeve expansion (K-L expansion) and the reproducing kernel Hilbert space representation (RKHS representation). We first make some notes with regard to these representations.

(a) The K-L expansion is applicable when

$$\int\!\!\int (R(t,s))^2\, dt\, ds < \infty. \tag{C-2}$$

(b) The RKHS approach is applicable in general, but in these notes we only give this representation under the conditions invoked for the K-L expansion.

(c) All of these representations induce an isomorphism between the Hilbert space generated by linear combinations of $\{X(t) : t \in I\}$, denoted $\mathcal{H}_X$, and some other Hilbert space. One of the main purposes of doing this is that it allows one to carry out certain projection operations in $\mathcal{H}_X$ by carrying them out in the other Hilbert space (where they may be much easier and/or more transparent) and transferring them back via the isomorphism.

(d) Much of the material in these notes is contained in Grenander (1980) and Wahba (1990).

C.1 Karhunen-Loeve Expansion

Here we suppose that R is continuous and that

$$\int\!\!\int (R(s,t))^2\, ds\, dt < \infty. \tag{C-3}$$

Then it is easy to show that the linear operator

$$\phi \in L^2 \;\mapsto\; \int R(s,\cdot)\,\phi(s)\, ds \in L^2 \tag{C-4}$$

is bounded and compact. Thus by Mercer's Theorem (cf. Grenander's Abstract Inference), we see that there exist continuous orthonormal functions $\phi_1, \phi_2, \ldots$ and eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ such that

$$\int R(s,t)\,\phi_k(s)\, ds = \lambda_k\,\phi_k(t), \qquad k \in \mathbb{N},\; t \in \mathbb{R}, \tag{C-5}$$

$$R(s,t) = \sum_{k} \lambda_k\,\phi_k(s)\,\phi_k(t), \qquad s, t \in \mathbb{R}, \tag{C-6}$$

where  the  convergence  is  uniform  over  compact  sets  and 

j  =  (C  —  7) 

Now let

$$Z_k = \frac{1}{\sqrt{\lambda_k}} \int X(t)\,\phi_k(t)\, dt, \qquad k \in \mathbb{N}, \tag{C-8}$$

where the integral exists in the $L^2$ sense. Then using the above conclusions of Mercer's Theorem, we see that $Z_1, Z_2, \ldots$ is an orthonormal sequence of random variables and

$$X(t) = \sum_{k=1}^{\infty} \sqrt{\lambda_k}\, Z_k\, \phi_k(t), \qquad t \in I, \tag{C-9}$$

where the sum on the RHS exists in the $L^2$ sense.

Remarks. (a) It is easy to show that $\mathcal{H}_X \subset \mathcal{H}_Z$, and in the case that $\{\phi_k\}$ forms a complete orthonormal basis we have equality. Thus in this latter case, it turns out that we have found a countable complete orthonormal basis for $\mathcal{H}_X$, namely $\{Z_k\}$.

(b) We have to find the $\phi_k$'s by solving the integral equation.

(c) Under the conditions specified in (a), we see that we have an isomorphism with $\ell^2$, since $\mathcal{H}_X = \{ \sum_k a_k Z_k : a \in \ell^2 \}$, and so the isomorphism map is

$$a \in \ell^2 \;\mapsto\; \sum_k a_k Z_k.$$
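As a concrete instance of the expansion, the sketch below uses standard Brownian motion on [0, 1], for which R(s,t) = min(s,t) and the eigenpairs are known in closed form. This example process is an assumption introduced for illustration and is not one of the processes treated in this report; the function names are hypothetical.

```python
import math


def eigenvalue(k):
    """k-th eigenvalue of the kernel min(s, t) on [0, 1]."""
    return 1.0 / (((k - 0.5) * math.pi) ** 2)


def eigenfunction(k, t):
    """k-th orthonormal eigenfunction of the kernel min(s, t) on [0, 1]."""
    return math.sqrt(2.0) * math.sin((k - 0.5) * math.pi * t)


def mercer_sum(s, t, terms=2000):
    """Truncated Mercer series, cf. (C-6)."""
    return sum(eigenvalue(k) * eigenfunction(k, s) * eigenfunction(k, t)
               for k in range(1, terms + 1))


def integral_operator(k, t, steps=2000):
    """Midpoint-rule value of the left-hand side of (C-5) for R = min."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * width
        total += min(s, t) * eigenfunction(k, s)
    return total * width


print(mercer_sum(0.3, 0.7))                                      # approximately min(0.3, 0.7)
print(integral_operator(1, 0.3), eigenvalue(1) * eigenfunction(1, 0.3))
```

In cases like this one the integral equation in remark (b) can be solved by hand; in general the eigenpairs have to be computed numerically.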

C.2  Reproducing  Kernel  Hilbert  Spaces 

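Here we again suppose the same framework as in the K-L expansion. Now we let

$$\mathcal{H}_1 = \Bigl\{ f \in L^2 : \sum_k \frac{\langle f, \phi_k \rangle^2}{\lambda_k} < \infty \Bigr\}. \tag{C-10}$$

For $f, g \in \mathcal{H}_1$, we define the inner product by

$$\langle f, g \rangle_1 = \sum_k \frac{\langle f, \phi_k \rangle \langle g, \phi_k \rangle}{\lambda_k}, \tag{C-11}$$

where

$$\langle f, \phi_k \rangle = \int f(t)\,\phi_k(t)\, dt. \tag{C-12}$$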


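Proposition. (i) $(\mathcal{H}_1, \langle \cdot, \cdot \rangle_1)$ is a Hilbert space.

(ii) $f \in \mathcal{H}_1$ implies f is continuous.

(iii) If $f \in \mathcal{H}_1$, then

$$\langle f, R(t,\cdot) \rangle_1 = f(t), \qquad f \in \mathcal{H}_1,\; t \in \mathbb{R}. \tag{C-13}$$

(iv) $\mathcal{H}_1$ and $\mathcal{H}_X$ are isomorphic under the extension of the mapping

$$\sum_{k=1}^{K} a_k X(t_k) \;\mapsto\; \sum_{k=1}^{K} a_k R(t_k,\cdot). \tag{C-14}$$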

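Remark. The above is essentially proved in Wahba (1990). But the verification of (iii) is quite easy. In particular

$$\begin{aligned}
\langle f, R(t,\cdot) \rangle_1 &= \sum_{k=1}^{\infty} \frac{\langle f, \phi_k \rangle \langle R(t,\cdot), \phi_k \rangle}{\lambda_k} \\
&= \sum_{k=1}^{\infty} \frac{\langle f, \phi_k \rangle\, \lambda_k\, \phi_k(t)}{\lambda_k} \\
&= \sum_{k=1}^{\infty} \langle f, \phi_k \rangle\, \phi_k(t) \\
&= f(t).
\end{aligned}$$

Also to support the isomorphism statement, we note that

$$\langle X(s), X(t) \rangle = R(s,t) = \langle R(s,\cdot), R(t,\cdot) \rangle_1,$$

and we then can use the linearity/continuity properties of the inner products to show the isomorphism.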

Remark. One of the main reasons to set up the RKHS approach can be motivated by trying to set up another Hilbert space isomorphism and seeing why it is not the right embedding. Specifically, suppose we tried to set up an identification with $L^2$ by taking

$$f \in L^2 \;\mapsto\; \int f(t)\,X(t)\, dt \tag{C-15}$$

and associating the inner product

$$\langle f, g \rangle_2 = \int\!\!\int f(s)\,g(t)\,R(s,t)\, ds\, dt. \tag{C-16}$$

This last expression is truly an inner product and it is true that

$$E\left( \left( \int f(t)\,X(t)\, dt \right) \left( \int g(t)\,X(t)\, dt \right)^{\!*} \right) = \langle f, g \rangle_2. \tag{C-17}$$

We should note that $\langle \cdot, \cdot \rangle_2$ is truly an inner product modulo one small detail: it may be possible that $\langle f, f \rangle_2 = 0$ without f being 0. This small discrepancy is insignificant. However, there is another detail about this identification which is not. Specifically, this identification will not exhaust $\mathcal{H}_X$, i.e., not every element in $\mathcal{H}_X$ can be represented by a stochastic integral $\int f(t)\,X(t)\, dt$ for $f \in L^2$. The reason for this is intuitive by the following argument. Let $Z_1, Z_2, \ldots$ be the orthonormal sequence as given in the K-L expansion and let $a \in \ell^2$ be such that

$$\sum_k \frac{a_k^2}{\lambda_k} = \infty. \tag{C-18}$$

Since $\lambda_k \to 0$, this is always possible. It will turn out that there is no way to represent $Z = \sum_k a_k Z_k \in \mathcal{H}_X$ by an integral $\int f(t)\,X(t)\, dt$, since

$$\sum_{k=1}^{\infty} a_k Z_k = \sum_{k=1}^{\infty} a_k \int \frac{\phi_k(t)}{\sqrt{\lambda_k}}\, X(t)\, dt. \tag{C-19}$$

But we see that $\sum_{k=1}^{\infty} a_k \phi_k / \sqrt{\lambda_k}$ is not in $L^2$.


Remark. Note that in general, except for degenerate situations, $\mathcal{H}_X$ is not equal to $L^2(\sigma(X))$, i.e., $\mathcal{H}_X$ is not equal to the space of functions $\psi(X)$ which are square integrable ($E(\psi^2(X)) < \infty$). This is easy to see even in the Gaussian case. Specifically, it is not too hard to show that all the random variables in $\mathcal{H}_X$ are Gaussian, and this is clearly not the case for $L^2(\sigma(X))$: $X^2(t)$ is not Gaussian if X(t) is non-degenerate (it has a scaled chi-square distribution), but $X^2(t)$ is square-integrable. We know that the best estimate of a random variable Y based on knowledge of X is given by $E(Y \mid X)$, and in general this may not actually be in $\mathcal{H}_X$. However, for the Gaussian case it is, and this is one of the attractive features of the model of Gaussian processes.

Remark. One assumption in Cambanis/Masry (1983) is that the signal function S satisfies the condition that there exists a square-integrable function f such that

$$\int R(t,s)\,f(s)\, ds = S(t). \tag{C-20}$$

But this implies that

$$S(\cdot) = \sum_{k=1}^{\infty} \lambda_k \langle f, \phi_k \rangle\, \phi_k(\cdot), \tag{C-21}$$

so that $S = \sum_k a_k \phi_k$ where $\{a_k / \lambda_k\}_k$ is in $\ell^2$. This allows the log-likelihood ratio statistic, based on knowing all of X and after ignoring terms not dependent on the data X, to be given by

$$\operatorname{Re}\left( \int X(t)\, f^*(t)\, dt \right). \tag{C-22}$$

Appendix D Properties of GSNRs


In this section we discuss in greater detail some optimality results mentioned in subsections A.1 and A.2, which are related to choosing resolutions maximizing generalized signal-to-noise ratios (GSNR). These results were used to justify the criterion of finding resolutions which maximize the GSNR. Essentially we are quoting results from Cambanis/Masry (1983), hereafter referred to as CM; the norm used in this appendix refers to the norm in the reproducing kernel Hilbert space as defined in the previous appendix. In fact, in this appendix we will use all the notation which was set up in the previous appendix, especially the notation for projections. In CM (1983), it was shown that under the assumptions given in (a1) through (a5), for any set of resolutions represented by $\underline{\rho}_n$, the corresponding probability of detection is given by

$$P_D(\underline{\rho}_n) = \Phi\bigl( \| P_{L_n} S \| - \Phi^{-1}(1 - P_{fa}) \bigr), \tag{D-1}$$

where $L_n$ is the linear subspace generated in the reproducing kernel Hilbert space by the functions $\{R(\rho_j,\cdot) : 1 \le j \le n\}$, $\phi$ is the probability density function of the standard real Gaussian, i.e.,

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad z \in \mathbb{R}, \tag{D-2}$$

and $\Phi$ is the cumulative distribution function for the above probability density function, i.e.,

$$\Phi(z) = \int_{-\infty}^{z} \phi(z')\, dz', \qquad z \in \mathbb{R}. \tag{D-3}$$

Note that this implies that the sequence of resolutions which maximizes the GSNR ($= \| P_{L_n} S \|$) also maximizes the probability of detection $P_D(\underline{\rho}_n)$. Hence the justification for focusing on the resolutions which maximize the GSNR. Based on the above result in (D-1), it is a straightforward exercise in calculus to show that if $\underline{\rho}_n^0$ is an optimal sequence for maximizing the probability of detection, or equivalently maximizing the GSNR, then


ct-p\„sF  -f  2||S|IVN’  -  *'1(1  -  M  •  (D  - 4) 

This result can be used to help choose the constant K in the general loss function given in subsection A.2. It can also be used to derive alternative algorithms for finding resolutions without directly trying to maximize the GSNR. This was discussed in some detail in subsection A.1 for the case where the noise process had independent increments. Based on (D-4), it is straightforward to define the notion of a sequence $\underline{\rho}_n^*$ being asymptotically optimal. Specifically, let $\underline{\rho}_n^0$ be a sequence of optimal resolutions. We say a sequence of vectors of resolutions $\underline{\rho}_n^*$ is asymptotically optimal if and only if

$$\frac{\|S\|^2 - \| P_{L_n^*} S \|^2}{\|S\|^2 - \| P_{L_n^0} S \|^2} \;\longrightarrow\; 1 \quad \text{as } n \to \infty, \tag{D-5}$$

where $L_n^0$ represents the linear space spanned by $\{R(\rho_j^0,\cdot) : 1 \le j \le n\}$ and $L_n^*$ represents the linear space spanned by $\{R(\rho_j^*,\cdot) : 1 \le j \le n\}$. For a further discussion of these and related issues see CM (1983).
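The relation between the GSNR and the probability of detection in (D-1) can be sketched numerically. The code below is an illustration under stated assumptions: it treats the real-valued case, uses the standard reproducing-kernel identity that the squared norm of the projection of S onto the span of the kernel sections equals s' K^{-1} s (with K the matrix of kernel values and s the vector of signal samples at the chosen resolutions), and the kernel 1/max and the signal 1/rho^2 are hypothetical choices made only for the example.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def detection_probability(gsnr, pfa):
    """P_D for a given GSNR and false-alarm probability, following the form of (D-1)."""
    return _STD_NORMAL.cdf(gsnr - _STD_NORMAL.inv_cdf(1.0 - pfa))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def gsnr_squared(resolutions, kernel, signal):
    """Squared norm of the projection of S onto span{R(rho_j, .)} (real case)."""
    k = [[kernel(a, b) for b in resolutions] for a in resolutions]
    s = [signal(rho) for rho in resolutions]
    return sum(si * xi for si, xi in zip(s, solve(k, s)))


kernel = lambda a, b: 1.0 / max(a, b)      # hypothetical noise kernel
signal = lambda rho: 1.0 / rho ** 2        # hypothetical signal samples
for rhos in ([1.0], [1.0, 2.0]):
    g2 = gsnr_squared(rhos, kernel, signal)
    print(rhos, g2, detection_probability(g2 ** 0.5, pfa=0.05))
```

Adding a resolution can only enlarge the subspace, so the GSNR and hence the probability of detection at a fixed false-alarm rate cannot decrease; a zero GSNR returns the false-alarm probability itself.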
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